STATISTICAL ANALYSIS IN PSYCHOLOGY AND EDUCATION

3rd edition

George A. Ferguson

INTERNATIONAL STUDENT EDITION


McGRAW-HILL 
SERIES IN 
PSYCHOLOGY 


Consulting Editors 
Norman Garmezy 
Richard L. Solomon 
Lyle V. Jones 
Harold W. Stevenson 


Adams, Human Memory
Beach, Hebb, Morgan, and Nissen, The Neuropsychology of Lashley
Von Békésy, Experiments in Hearing
Berkowitz, Aggression: A Social Psychological Analysis
Berlyne, Conflict, Arousal, and Curiosity
Blum, Psychoanalytic Theories of Personality
Brown, The Motivation of Behavior
Brown and Ghiselli, Scientific Method in Psychology
Butcher, MMPI: Research Developments and Clinical Applications
Campbell, Dunnette, Lawler, and Weick, Managerial Behavior, Performance, and Effectiveness
Cofer, Verbal Learning and Verbal Behavior
Cofer and Musgrave, Verbal Behavior and Learning: Problems and Processes
Crafts, Schneirla, Robinson, and Gilbert, Recent Experiments in Psychology
Crites, Vocational Psychology
D'Amato, Experimental Psychology: Methodology, Psychophysics, and Learning
Davitz, The Communication of Emotional Meaning
Deese and Hulse, The Psychology of Learning
Dollard and Miller, Personality and Psychotherapy
Edgington, Statistical Inference: The Distribution-free Approach
Ellis, Handbook of Mental Deficiency
Epstein, Varieties of Perceptual Learning
Ferguson, Statistical Analysis in Psychology and Education
Forgus, Perception: The Basic Process in Cognitive Development
Franks, Behavior Therapy: Appraisal and Status
Ghiselli, Theory of Psychological Measurement
Ghiselli and Brown, Personnel and Industrial Psychology
Gilmer, Industrial Psychology
Gilmer, Industrial and Organizational Psychology
Gray, Psychology Applied to Human Affairs
Guilford, Fundamental Statistics in Psychology and Education
Guilford, The Nature of Human Intelligence
Guilford, Personality
Guilford, Psychometric Methods
Guilford and Hoepfner, The Analysis of Intelligence
Guion, Personnel Testing
Haire, Psychology in Management
Hirsch, Behavior-genetic Analysis
Hirsh, The Measurement of Hearing
Hurlock, Adolescent Development
Hurlock, Child Development
Hurlock, Developmental Psychology
Jackson and Messick, Problems in Human Assessment
Karn and Gilmer, Readings in Industrial and Business Psychology
Krech, Crutchfield, and Ballachey, Individual in Society
Lawler, Pay and Organizational Effectiveness: A Psychological View
Lazarus, A., Behavior Therapy and Beyond
Lazarus, R., Adjustment and Personality
Lazarus, R., Psychological Stress and the Coping Process
Lewin, A Dynamic Theory of Personality
Lewin, Principles of Topological Psychology
Maher, Principles of Psychopathology
Marascuilo, Statistical Methods for Behavioral Science Research
Marx and Hillix, Systems and Theories in Psychology
Messick and Brayfield, Decision and Choice: Contributions of Sidney Siegel
Miller, Language and Communication
Morgan, Physiological Psychology
Mulaik, The Foundations of Factor Analysis
Nunnally, Psychometric Theory
Overall and Klett, Applied Multivariate Analysis
Rethlingshafer, Motivation as Related to Personality
Robinson and Robinson, The Mentally Retarded Child
Rosenthal, Genetic Theory and Abnormal Behavior
Scherer and Wertheimer, A Psycholinguistic Experiment on Foreign Language Teaching
Shaw, Group Dynamics: The Psychology of Small Group Behavior
Shaw and Costanzo, Theories of Social Psychology
Shaw and Wright, Scales for the Measurement of Attitudes
Sidowski, Experimental Methods and Instrumentation in Psychology
Siegel, Nonparametric Statistics for the Behavioral Sciences
Spencer and Kass, Perspectives in Child Psychology
Stagner, Psychology of Personality
Townsend, Introduction to Experimental Methods for Psychology and the Social Sciences
Vinacke, The Psychology of Thinking
Wallen, Clinical Psychology: The Study of Persons
Warren and Akert, The Frontal Granular Cortex and Behavior
Waters, Rethlingshafer, and Caldwell, Principles of Comparative Psychology
Winer, Statistical Principles in Experimental Design
Zubek and Solberg, Human Development


John F. Dashiell was Consulting Editor of this series from its inception in 1931
until January 1, 1950. Clifford T. Morgan was Consulting Editor of this series
from January 1, 1950 until January 1, 1959. Harry F. Harlow assumed the duties
of Consulting Editor from 1959 to 1965. In 1965 a Board of Consulting Editors
was established according to areas of interest. The current board members are
Richard L. Solomon (physiological, experimental), Norman Garmezy (abnormal,
clinical), Harold W. Stevenson (child, adolescent, human development), and
Lyle V. Jones (statistical, quantitative).


McGraw-Hill Kogakusha, Ltd.
Tokyo Düsseldorf Johannesburg London Mexico New Delhi Panama
Rio de Janeiro Singapore Sydney


STATISTICAL ANALYSIS IN PSYCHOLOGY AND EDUCATION 
INTERNATIONAL STUDENT EDITION 


Exclusive rights by McGraw-Hill Kogakusha, Ltd., for manufacture and export. 
This book cannot be re-exported from the country to which it is consigned by 
McGraw-Hill. 


Copyright © 1959, 1966, 1971 by McGraw-Hill, Inc. All rights reserved. No part
of this publication may be reproduced, stored in a retrieval system, or transmitted,
in any form or by any means, electronic, mechanical, photocopying, recording, or
otherwise, without the prior written permission of the publisher.


Library of Congress Catalog Card Number 74-150459


PRINTING Co., LTD. TOKYO, JAPAN


CONTENTS


Preface


PART I BASIC STATISTICS

1. Basic Ideas in Statistics
2. Frequency Distributions and Their Graphic Representation
3. Averages
4. Measures of Variation, Skewness, and Kurtosis
5. Probability and the Binomial Distribution
6. The Normal Curve
7. Correlation
8. Prediction in Relation to Correlation
9. Sampling
10. Estimation
11. Tests of Significance: Means
12. Tests of Significance: Other Statistics
13. The Analysis of Frequencies Using Chi Square


PART II THE DESIGN OF EXPERIMENTS

14. The Structure and Planning of Experiments
15. Analysis of Variance: One-way Classification
16. Analysis of Variance: Two-way Classification
17. Analysis of Variance: Three-way Classification
18. Multiple Comparisons
19. Trend Analysis
20. Analysis of Covariance


PART III NONPARAMETRIC STATISTICS

21. The Statistics of Ranks
22. Nonparametric Tests of Significance


PART IV PSYCHOLOGICAL TEST AND MULTIVARIATE STATISTICS

23. Test Construction Statistics
24. Errors of Measurement
25. Score Transformations: Norms
26. Partial and Multiple Correlation
27. An Introduction to Factor Analysis


Answers to Exercises
Glossary of Symbols


APPENDIX TABLES

A. Ordinates and Areas of the Normal Curve
B. Critical Values of t
C. Critical Values of Chi Square
D. Critical Values of F
E. Transformation of r to z_r
F. Critical Values of the Correlation Coefficient
G. Critical Values of ρ, the Spearman Rank Correlation Coefficient
H. Probabilities Associated with Values as Large as Observed Values of S in the
   Kendall Rank Correlation Coefficient
I. Critical and Quasi-critical Lower-tail Values of W+ (and Their Probability
   Levels) for Wilcoxon's Signed-rank Test
J. Coefficients of Orthogonal Polynomials
K. Critical Lower-tail Values of R1 for Rank Test for Two Independent Samples
L. Critical Values of the Studentized Range Statistic
M. Squares and Square Roots of Numbers from 1 to 1,000


References
Index


PREFACE 


This book introduces students and research workers in psychology and education
to the concepts and applications of statistics. Emphasis is placed on the
analysis and interpretation of data resulting from the conduct of experiments.
Students and investigators in experimental medicine, psychiatry, sociology, and
other disciplines should also find the book useful.

This book has been designed as a text for both one-semester and full-year
courses in statistics. In either case the instructor has some freedom of choice
in the selection of material. For a one-semester course the selection will
usually include most of Chapters 1 to 13 and possibly sections of some of the
remaining chapters.

I have attempted not only to introduce the student to the practical technology
of statistics but also to explain in a nonmathematical and frequently intuitive
way the nature of statistical ideas. This is not always easy. Obviously, the
extent to which an understanding of statistics can be communicated without some
mathematical knowledge is limited. Skill in high school or first-year college
algebra will prove most helpful.

The writing of a book of this type demands many compromises between a tidy,
logical arrangement of material, sound pedagogy, and common usage, which are not
always compatible. The search for completeness has led to the inclusion of
occasional sections which are not essential in an introductory text. The
instructor can readily identify these sections and omit them if he chooses.

Many changes have been made in this edition, some important and some trivial.
The book has been divided into four parts: Basic Statistics, The Design of
Experiments, Nonparametric Statistics, and Psychological Test and Multivariate
Statistics. Four new chapters have been added: Analysis of Variance: Three-way
Classification, Multiple Comparisons, Test Construction Statistics, and Factor
Analysis. The section on the analysis of variance for repeated measurement
designs has been expanded. The material on nonparametric statistics has been
modified. Some material on now-obsolete computational methods has been deleted,
and some rarely used statistical procedures have been dropped.

The usefulness of this book is enhanced by the kindness of authors and
publishers who have permitted the adaptation and reproduction of tables
and other materials published originally by them. I should like to express
my gratitude to Francis G. Cornell, Allen L. Edwards, J. P. Guilford, Harry
H. Harman, H. Leon Harter, R. W. B. Jackson, S. K. Katti, M. G. Kendall,
John F. Kenny, Don Lewis, Quinn McNemar, Edwin G. Olds, George W.
Snedecor, Herbert Sorenson, L. R. Verdooren, James E. Wert, and Frank
Wilcoxon; and to the Scottish Council for Research in Education, the
University of London Press Ltd., Charles Griffin & Company, Ltd.,
Prentice-Hall, Inc., John Wiley & Sons, Inc., Van Nostrand Reinhold Company,
Rinehart & Company, Inc., Iowa State College Press, The University of
Chicago Press, Biometrika, and the Annals of Mathematical Statistics. I
owe special thanks to the Literary Executor of the late Sir Ronald A.
Fisher, F.R.S., to Dr. Frank Yates, F.R.S., and to Oliver and Boyd,
Edinburgh, for permission to reprint Tables III, IV, and VII of their book
Statistical Tables for Biological, Agricultural, and Medical Research.

I should like to express here my indebtedness to the late Sir Godfrey H. 
Thomson and to W. G. Emmett and D. N. Lawley, all of the University of 
Edinburgh. These three are responsible for my persisting interest in the 
applications of statistical method to psychological problems. In particular, 
I should like to express my gratitude to Lady Thomson for permission to 
reproduce certain tables from Sir Godfrey's work. 

This book has benefited greatly from the many constructive criticisms of 
the manuscript given by Julian C. Stanley of the University of Wisconsin, 
Lyle Jones and Stanley A. Mulaik of the University of North Carolina, and 
Philip H. DuBois of Washington University. I regret that limitations of time 
have prevented me from incorporating some of their more insightful recom- 
mendations. 


George A. Ferguson 


BASIC IDEAS IN STATISTICS 


1.1 INTRODUCTION 


This book is concerned with the elementary statistical treatment of
experimental data in psychology, education, and related disciplines. The data
resulting from any experiment are usually a collection of observations or
measurements. The conclusions to be drawn from the experiment cannot
be reliably ascertained by simple direct inspection of the data.
Classification, summary description, and rules of evidence for the drawing of
inference are required. Statistics provides the methodology whereby this
can be done.

Implicit in any experiment is the presumption that it is possible to argue
validly from the particular to the general and that new knowledge can be
obtained by the process of inductive inference. The statistician does not
assume that such arguments can be made with certainty. On the contrary,
he assumes that some degree of uncertainty must attach to all such
arguments; that some of the inferences drawn from the data of experiments
are wrong. He further assumes that the uncertainty itself is amenable
to precise and rigorous treatment, that it is possible to make rigorous
statements about the uncertainty which attaches to any particular inference.
Thus in the uncertain milieu of experimentation he applies a rigorous
method.

A knowledge of statistics is an essential part of the training of
students in psychology. There are many reasons for this. First, an
understanding of the modern literature of psychology requires a knowledge of
statistical method and modes of thought. A high proportion of current
books and journal articles either report experimental findings in statistical
form or present theories or arguments involving statistical concepts.
These concepts play an increasing role in our thinking about psychological
problems, quite apart from the treatment of data. The student need
only consider, for example, the role of statistical concepts in current lines
of theorizing in the field of learning to grasp the force of this argument.
Second, training in psychology at an advanced level requires that the
student himself design and conduct experiments. The design of an experiment
is inseparable from the statistical treatment of the results. Experiments
must be designed to enable the treatment of results in such a way as
to permit clear interpretation, and to fulfill the purposes which motivated
the experiment in the first place. If the design of an experiment is faulty,
no amount of statistical manipulation can lead to the drawing of valid
inferences. Experimental design and statistical procedures are two sides of the
same coin. Thus not only must the advanced student conduct experiments
and interpret results, he must plan his experiments in such a way
that the interpretation of results can conform to known rules of scientific
evidence. Third, training in statistics is training in scientific method.
Statistical inference is scientific inference, which in turn is inductive
inference, the making of general statements from the study of particular cases.
These terms are for all practical purposes, and at a certain level of
generality, synonymous. Statistics attempts to make induction rigorous.
Induction is regarded by some scholars as the only way in which new knowledge
comes into the world. While this statement is debatable, the role in
modern society of scientific discovery through induction is obviously of the
greatest importance. For this reason no serious student of psychology, or
any other discipline, can afford not to know something of the rudiments of
the scientific approach to problems. Statistical procedures and ideas play
an important role in this approach.


1.2 THE BROAD ROLE OF QUANTIFICATION IN PSYCHOLOGY


While this book is largely concerned with elementary statistical procedures
and ideas, some mention may be made of the broad role of quantitative
method in psychology.

The attempt to quantify has a long and distinguished history in experimental
psychology, which indeed may be regarded as synonymous with the
history of that science itself. Since the experimental work in psychophysics
of E. H. Weber and Gustav Fechner in the nineteenth century, determined
attempts have been made to develop psychology as an experimental
science. The early psychophysicists were concerned with the relationship
between the "mind" and the "body" and developed certain mathematical
functions which they held to be descriptive of that relationship. While
much of their thinking on the mind-body problem has been discarded, their
methods and techniques with development and elaboration are still used.
Shorn of its philosophical and theoretical encumbrances, the work of the
early psychophysicists was reduced in effect to the study of the relationship
between measurements, obtained in two different ways, of what
were presumed to be the same property. Thus, for example, they studied
the relationship between weight, length, and temperature, defined by the
responses of human subjects as instruments, and weight, length, and
temperature, defined by other measuring instruments, scales, foot rules, and
thermometers. A psychophysical law, so called, is a statement of the
relationship between measurements obtained by these two methods. Modern
psychophysics is concerned to some considerable extent with the scaling of
the responses of the human subject as instrument and with the use of the
human subject as instrument in dealing with a wide variety of practical
problems. It may perhaps be referred to as human instrumentation.

The early psychophysicists invented certain experimental methods and
developed statistical procedures for handling the data obtained by these
methods. It is of interest to note that one method, the constant process,
developed by G. E. Müller and F. M. Urban, has, with modification, found
application in biological-assay work in assessing the potency of hormones,
toxicants, and drugs of all types. It is currently known in biology as the
method of probits (Finney, 1944, 1947).

Statistical methods have found extensive application in the psychological
and educational testing field and in the study of human ability. Since
the time of Binet, who developed the first extensively used test of
intelligence and whose thinking was influenced by the early psychophysicists, a
comprehensive body of theory and technique has been developed which is
primarily statistical in type. This body of theory and technique is
concerned with the construction of instruments for measuring human ability,
personality characteristics, attitudes, interests, and many other aspects of
behavior; with the nature and magnitude of the errors involved in such
measurement; with the logical conditions which such measuring instruments
must satisfy; with the quantitative prediction of human behavior;
and with other related topics. Some of these topics are discussed in Part IV
of this book.

The use of psychological tests stimulated the development of the techniques
of factor analysis, which are used to some extent in contemporary
psychology. Problems arise which involve a study of the relationships
between sets of variables, sometimes as many as 50 or 60 and perhaps
more. Factor analysis, by identifying underlying variables, attempts to
provide a simplified description of these relationships, which facilitates an
interpretation and comprehension of the information in the data. Factor
analysis has found a number of uses in branches of science other than
psychology, including meteorology and agriculture. An elementary introduction
to factor analysis is given in Chap. 27 of this book.

Within recent years frequent use has been made of statistical concepts
in the construction of models designed to provide some explanation and
understanding of observable phenomena. Such models are used in the field
of learning. Further, many biological scientists are currently concerned
with the construction of models which may possibly bear some correspondence
to the functioning of certain aspects of the central nervous
system. While these attempts may be premature and their success cannot
at this time be evaluated, it is possible that in future the models which will
prove helpful in understanding the functioning of the human brain will
either implicitly or explicitly involve statistical concepts. In a system
comprised of a complex network of nerve fibers, the transmission of impulses
can be conceived in probabilistic terms.

The study of the topics discussed above demands a knowledge of statistical
method and a comprehension of the basic ideas of statistics as a
starting point. It would seem that as psychology develops, increasing
emphasis will be placed on quantitative procedure and an increasing
degree of statistical sophistication will be required of the student.


1.3 STATISTICS AS THE STUDY OF POPULATIONS


Statistics is a branch of scientific methodology. It deals with the collection,
classification, description, and interpretation of data obtained by the
conduct of surveys and experiments. Its essential purpose is to describe
and draw inferences about the numerical properties of populations. The
terms, population and numerical property, require clarification.

In everyday language the term population is used to refer to groups or
aggregates of people. We speak, for example, of the population of the
United States, or of the state of Texas, or of the city of New York, meaning
by this all the people who occupy defined geographical regions at specified
times. This, however, is a particular usage of the term population. The
statistician employs the term in a more general sense to refer not only to
defined groups or aggregates of people, but also to defined groups or
aggregates of animals, objects, materials, measurements, or "things" or
"happenings" of any kind. Thus the statistician may define, for his particular
purposes, populations of laboratory animals, trees, nerve fibers,
liquids, soil, manufactured articles, automobile accidents, microorganisms,
birds' eggs, insects, or fishes in the sea. On occasion he may deal
with a population of measurements. By this is meant an indefinitely large
aggregate of measurements which, hypothetically, might be obtained under
specified experimental conditions. To illustrate, a series of measurements
might be made of the length of a desk. Some or all of these measurements
may differ one from another because of the presence of errors of measurement.
This series of measurements may be regarded as part of an indefinitely
large aggregate or population of measurements which might, hypothetically,
be obtained by measuring the length of the desk over and over again
an indefinitely large number of times.

The general concept implicit in all these particular uses of the word
population is that of group or aggregation. The statistician's concern is with
properties which are descriptive of the group or aggregation itself rather
than with properties of particular members. Thus measurements may be
made of the height and weight of a group of individuals. These measurements
may be added together and divided by the number of cases to obtain
the mean height and weight. These means describe a property of the group
as a whole and are not descriptive of particular individuals. To illustrate
further, a child may have an IQ of 90 and belong to a high socioeconomic
group. Another child may have an IQ of 120 and belong to a low
socioeconomic group. These facts as such about individual children do not
directly concern the statistician. If, however, questions are raised about
the proportion of children in a particular population or subpopulation with
IQs above or below a specified value, or if more general questions are
raised about the relationship between intelligence and socioeconomic
level, then these are questions of a statistical nature, and the statistician
has techniques which assist their exploration.
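
The distinction between facts about individuals and properties of the group
can be sketched in a few lines of Python. The records below are invented for
illustration, and the helper name proportion_above is hypothetical; the sketch
shows only that the mean and the proportion above a cutoff are computed from
the group as a whole.

```python
from statistics import mean

# Hypothetical records: (IQ, socioeconomic level) for eight children.
# The values are invented for illustration only.
children = [
    (90, "high"), (120, "low"), (105, "high"), (98, "low"),
    (112, "high"), (87, "low"), (101, "high"), (110, "low"),
]

def proportion_above(scores, cutoff):
    """Return the fraction of scores strictly greater than cutoff."""
    return sum(1 for s in scores if s > cutoff) / len(scores)

iqs = [iq for iq, _ in children]

# Group properties: they describe the aggregate, not any one child.
group_mean = mean(iqs)
share_above_100 = proportion_above(iqs, 100)

# The same summary computed within each subpopulation.
mean_by_level = {
    level: mean(iq for iq, lv in children if lv == level)
    for level in ("high", "low")
}

print(group_mean, share_above_100, mean_by_level)
```

No single child's record answers the question of what proportion lies above
100; that figure exists only for the group.
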

The distinction is sometimes made between finite and infinite populations.
The children attending school in the city of Chicago, the inmates of
penitentiaries in Ontario, the cards in a deck are examples of finite
populations. The members of such a population can presumably be counted, and
a finite number obtained. The possible rolls of a die and the possible
observations in many scientific experiments are examples of infinite or
indefinitely large populations. The number of rolls of a die or the number of
scientific observations may, at least theoretically, be increased without any
finite limit. In many situations the populations which the statistician
proposes to describe are finite, but so large that for all practical purposes
they may be regarded as infinite. The 200 million or so people living in the
United States constitute a large but finite population. This population is so
large that for many types of statistical inference it may be assumed to be
infinite. This would not apply to the cards in a deck, which may be thought
of as a small finite population of 52 members.

Most populations are comprised of naturally distinguishable members,
as is, of course, the case with people, animals, measurements, or the rolls
of a die. Some populations are not so comprised, as is the case with liquids,
soils, woven fabrics, or, for that matter, human behavior. How is it possible
to apply the concept of group or aggregation to populations of this latter
type? This may be done by defining the population member arbitrarily as a
liter, a cubic centimeter, a square yard, or some such unit. The whole
population may be thought to be composed of an aggregate of such members.
Likewise, in the study of human behavior, the psychologist frequently
concerns himself with arbitrarily defined bits of behavior, although behavior as
such may perhaps be regarded as a continuous flow or sequence.

Statistics is concerned with the numerical properties of populations, that
is, with properties to which numerals can in some manner be assigned. The
logical implications of the term numerical property are complex and need
not be elaborated here. To illustrate briefly, however, in any population of
mental-hospital patients some may be classed as psychoneurotic, others as
schizophrenic psychotic, others as psychotic with organic brain disease,
and so on. Further, some patients may come from broken homes, while
others may have a normal healthy home background. Some may have a
history of mental disease in the family, and others may not. We may be said
to apply a statistical method when we concern ourselves with how many
patients in the population fall within these various classes, that is, how
many are psychoneurotic, schizophrenic psychotic, and the like, and how
many come from broken homes, how many do not, and so on. Further, the
flicker-fusion rates of some part or all of the population may be measured
and attention directed to the numbers of patients who fall within specified
ranges of flicker-fusion rate, to mean rates for various classes of patients,
and to related problems. The investigation of such problems as these may
be said to involve a statistical method. In general, the statistician's concern
is with those properties of populations which can be expressed in
numerical form.


1.4 STATISTICS AS THE STUDY OF VARIATION


Statistics is sometimes conceptualized as the study of variation, because it
provides a technology for the exploration of variation in the events of
nature and for the making of inferences about the causal circumstances
which underlie that variation. Emphasis on the study of variation
originated with Darwin in The Origin of Species (1859). Variation was a central
concept in the theory of natural selection because evolution could not
occur without it. In Darwin's words,


The many slight differences which appear in the offspring from the same parent . . . may be
called individual differences. . . . These individual differences are of the highest importance
for us, for they are often inherited, as must be familiar to everyone; and they thus afford
materials for natural selection to act on and accumulate.


The matter is more clearly stated in an editorial in the first issue of the
journal Biometrika (1901), probably written by Karl Pearson:


The starting point of Darwin's theory of evolution is precisely the existence of those
differences between individual members of a race or species which morphologists for the most
part rightly neglect. The first condition necessary, in order that any process of Natural
Selection may begin among a race, or species, is the existence of differences among its members;
and the first step in an enquiry into the possible effect of a selective process upon any
character of a race must be an estimate of the frequency with which individuals, exhibiting any given
degree of abnormality with respect to that character, occur. The unit, with which such an
enquiry must deal, is not an individual but a race, or a statistically representative sample of a
race; and the result must take the form of a numerical statement, showing the relative
frequency with which the various kinds of individuals composing the race occur.
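

A numerical statement of relative frequency, of the kind the editorial calls
for, can be sketched as follows. The classes and observations are invented for
illustration, and the function name relative_frequencies is hypothetical; the
sketch assumes a nonempty list of observations.

```python
from collections import Counter

def relative_frequencies(observations):
    """Map each distinct kind to its share of all observations."""
    counts = Counter(observations)
    total = sum(counts.values())
    return {kind: n / total for kind, n in counts.items()}

# Hypothetical sample of ten individuals classed by some character.
sample = ["short", "medium", "medium", "tall", "medium",
          "short", "medium", "tall", "medium", "short"]

freqs = relative_frequencies(sample)
print(freqs)
```

The result describes the sample as a unit: it says how often each kind occurs
and nothing about which individual is of which kind.
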


Darwin made no direct contribution to statistical method. He did,
however, create a theoretical context, based on observation and report,
which made the study of variation meaningful and required, as it were, the
development of statistical methods for its rigorous study. Darwin's disciple
Galton understood fully the concept of variation. He was responsible for
the initial applications of the so-called "normal" curve, or distribution, in
psychological enquiry and made important contributions to the development
of methods of correlation. He greatly influenced Karl Pearson, his disciple
and biographer. Pearson perceived his role as that of helping to build
a mathematical basis for evolutionary theory; he did, in fact, build the
foundations of modern statistics. Between 1894 and 1916 he published 19
papers and monographs on statistical subjects, some of great importance
and comprehensiveness. All these papers were titled Contributions to the
Mathematical Theory of Evolution.

Despite influences from many sources, modern statistics is largely a
direct emergent of the biological revolution of the nineteenth century
which Darwin helped to create. The central concept in evolutionary theory
which gave rise to this line of development was the concept of variation. Since
Pearson the development of statistical method has been closely associated
with the attempt to find solutions to biological problems. R. A. Fisher, who
developed the analysis of variance and made contributions between 1920 and
1960 which exceeded those of any other living person, devoted his life
primarily to statistical problems of experimentation in the biological sciences
and to the mathematical foundations of genetics.


1.5 SAMPLES AND SAMPLING

Because of the large size of many populations, it may be either impracticable or
impossible for the investigator to produce statistics based on all members. If, for
example, interest is in investigating the attitudes of adult Canadians toward
immigrants, it would obviously be a prohibitively expensive and time-consuming task to
measure the attitudes of all adult Canadians and produce statistics based on a study of
the complete population. If a population is indefinitely large, it is of course
impossible, ipso facto, to produce complete population statistics. Under circumstances
such as these the investigator draws what is spoken of as a sample. A sample is any
subgroup or subaggregate drawn by some appropriate method from a population, the method
used in drawing the sample being important. Methods used in drawing samples will be
discussed in later chapters of this book. Having drawn his sample, the investigator
utilizes appropriate statistical methods to describe its properties. He then proceeds
to make statements about the properties of the population from his knowledge of the
properties of the sample; that is, he proceeds to generalize from the sample to the
population. To return to the example above, an investigator might draw a sample of 1,000
adult Canadians, the term adult being assigned a precise meaning, measure their
attitudes toward immigrants using an acceptable technique of measurement, and calculate
the required statistics. Questions may then be raised about the attitudes of all adult
Canadians from the information obtained from a study of a sample of 1,000.

The fact that inferences can be made about the properties of populations from a
knowledge of the properties of samples is basic in research thinking. Such statements
are of course subject to error. The magnitude of the error involved in drawing such
inferences can, however, in most cases be estimated by appropriate procedures. Where no
estimate of error of any kind can be made, generalizations about populations from sample
data are worthless.


Information about properties of particular samples, quite apart from any 
generalizations about the population, is of little intrinsic interest in itself. 
Consider a case where the investigator's interest is in the relative effects of 
two types of psychotherapy when applied to patients suffering from a par- 
ticular mental disorder. He may select two samples of patients, apply one 
type of treatment to one sample and the other type of treatment to the other 
sample, and collect data on the relative rates of recovery of patients in the 
two samples. Clearly, in this case his interest is in finding out whether the 
one treatment is better or worse than the other when applied to the whole 
class of patients suffering from the mental disorder in question. He is inter- 
ested in the sample data only in so far as these data enable him to draw 
inferences with some acceptable degree of assurance about this general 
question. His experimental procedures must be designed to enable the 
drawing of such inferences, otherwise the experiment serves no purpose. 
On occasion research reports are found where the investigator states that 
the experimental results obtained should not be generalized beyond the 
particular sample of individuals who participated in the study. The adop- 
tion of this view means that the investigator has missed the essential 
nature of experimentation. Unless the intention is to generalize from a 
sample to a population, unless the procedures used are such as to enable 
such generalizations justifiably to be made, and unless some estimate of 
error can be obtained, the conduct of experiments is without point. 

Statistical procedures used in describing the properties of samples, or of 
populations where complete population data are available, are referred to 
by some writers as descriptive statistics. If we measure the IQ of the 
complete population of students in a particular university and compute the
mean IQ, that mean is a descriptive statistic because it describes a characteristic
of the complete population. If, on the other hand, we measure the
IQ of a sample of 100 students and compute the mean IQ for the sample,
that mean is also a descriptive statistic, because it describes a character- 
istic of that sample. 

Statistical procedures used in the drawing of inferences about the properties of
populations from sample data are frequently referred to as sampling statistics. If, for
example, we wish to make a statement about the mean IQ in the complete population of
students in a particular university from a knowledge of the mean computed on the sample
of 100 and estimate the error involved in this statement, we use procedures from
sampling statistics. The application of these procedures provides information about the
accuracy of the sample mean as an estimate of the population mean; that is, it indicates
the degree of assurance we may place in the inferences we draw from the sample to the
population.

In this section no discussion is advanced on methods of drawing samples or the
conditions which these methods must satisfy to allow the drawing of valid inferences
from the sample to the population. Further, no precise meaning has been assigned to the
term error. These topics will be elaborated at a later stage.


1.6 PARAMETERS AND ESTIMATES


A clear distinction is usually drawn between parameters and estimates. A parameter is a
property descriptive of the population. The term estimate refers to a property of a
sample drawn at random from a population. The sample value is presumed to be an estimate
of a corresponding population parameter. Suppose, for example, that a sample of 1,000
adult male Canadians of a given age range is drawn from the total population, the height
of the members of the sample measured, and a mean value, 68.972 in., obtained. This
value is an estimate of the population parameter which would have been obtained had it
been possible to measure all the members in the population. Usually parameters or
population values are unknown. We estimate them from our sample values. The distinction
between parameter and estimate reflects itself in statistical notation. A widely used
convention in notation is to employ Greek letters to represent parameters and Roman
letters to represent estimates. Thus the symbol σ, the Greek letter sigma, may be used to
represent the standard deviation in the population, the standard deviation being a
commonly used measure of variability. The symbol s may be used as an estimate of the
parameter σ. This convention in notation is applicable only within broad limits. By and
large we shall adhere to this convention in this book, although in certain instances it
will be necessary to depart from it. By common practice and tradition a Greek letter may
be used on occasion to denote a sample statistic.
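The distinction between a parameter and its estimate can be illustrated with a small
simulation. The sketch below is only an illustration under stated assumptions: the
population of heights is simulated (it is not real data), the mean of 69 in. and
standard deviation of 2.5 in. are arbitrary choices, and the names mu, sigma, x_bar, and
s are introduced here for convenience. In practice the parameters would be unknown; they
are computable here only because the whole population is artificial.

```python
import random
import statistics


def draw_sample_estimates(population, n, seed=0):
    """Draw a simple random sample of size n and return (sample mean, sample s)."""
    rng = random.Random(seed)
    sample = rng.sample(population, n)
    # statistics.stdev uses the n - 1 divisor; it serves here as the estimate s.
    return statistics.mean(sample), statistics.stdev(sample)


# Hypothetical population of 100,000 heights in inches (simulated, not real data).
_rng = random.Random(42)
population = [_rng.gauss(69.0, 2.5) for _ in range(100_000)]

# Parameters: properties of the complete population.
mu = statistics.mean(population)
sigma = statistics.pstdev(population)

# Estimates: properties of one random sample of 1,000 members.
x_bar, s = draw_sample_estimates(population, n=1000, seed=1)

print(f"parameter mu    = {mu:.3f}   estimate x_bar = {x_bar:.3f}")
print(f"parameter sigma = {sigma:.3f}   estimate s     = {s:.3f}")
```

Changing the seed draws a different sample and yields slightly different estimates,
whereas the parameters remain fixed.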


1.7 VARIABLES AND THEIR CLASSIFICATION


The term variable refers to a property whereby the members of a group or set differ one
from another. The members of a group may be individuals and may be found to differ in
sex, age, eye color, intelligence, auditory acuity, reaction time to a stimulus,
attitudes toward a political issue, and many other ways. Such properties are variables.
The term constant refers to a property whereby the members of a group do not differ one
from another. In a sense a constant is a particular type of variable; it is a variable
which does not vary from one member of a group to another or within a particular set of
defined conditions.

Labels or numerals may be used to describe the way in which one member of a group is the
same as or different from another. With variables like sex, racial origin, religious
affiliation, and occupation, labels are employed to identify the members which fall
within particular classes. An individual may be classified as male or female; of
English, French, or Dutch racial origin; Protestant or Catholic; a shoemaker or a
farmer; and so on. The label identifies the class to which the individual belongs. Sex
for most practical purposes is a two-valued variable, individuals being either male or
female. Occupation, on the other hand, is a multivalued variable. Any particular
individual may be assigned to any one of a large number of classes. With variables like
height, weight, intelligence, and so on, measuring operations may be employed which
enable the assignment of descriptive numerical values. An individual may be 72 in. tall,
weigh 190 lb, and have an IQ of 90.

The particular values of a variable are referred to as variates, or variate 
values. To illustrate, in considering the height of adult males, height is the 
variable, whereas the height of any particular individual is a variate, or 
variate value. 

In dealing with variables which bear a functional relationship one to 
another the distinction may be drawn between dependent and independent 
variables. Consider the expression Y = f(X). This expression says that a
given variable Y is some unspecified function of another variable X. The
symbol f is used generally to express the fact that a functional relationship
exists, although the precise nature of the relationship is not stated.
In any particular case the nature of the relationship may be known; that is,
we may know precisely what f means. Under these circumstances, for any
given value of X a corresponding value of Y can be calculated; that is, given 
X and a knowledge of the functional relationship, Y can be predicted. It is 
customary to speak of Y, the predicted variable, as the dependent variable 
because the prediction of it depends on the value of X and the known func- 
tional relationship, whereas X is spoken of as the independent variable. 
Given an expression of a kind Y = X², for any given value of X, an exact
value of Y can readily be determined. Thus if X is known, Y is also known 
exactly. Many of the functional relationships found in statistics permit 
probabilistic and not exact prediction to occur. Such relationships may 
provide the most probable value of Y for any given value of X, but do not 
permit the making of perfect predictions. 
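The contrast between exact and probabilistic prediction may be sketched as follows.
This is a minimal illustration, not a method taken from the text: the exact rule
Y = X squared is chosen only as an example, the random disturbance added to it is an
assumption made here to imitate an imperfect relationship, and the mean of the observed
Y values is used as a simple stand-in for "the most probable value of Y."

```python
import random
import statistics


def exact_y(x):
    """Exact functional relationship: Y is fully determined by X."""
    return x ** 2


def noisy_y(x, rng, sd=1.0):
    """Probabilistic relationship: the same rule plus a random disturbance (assumed)."""
    return exact_y(x) + rng.gauss(0.0, sd)


rng = random.Random(0)
x = 3

# Exact case: every observation at X = 3 gives the same Y.
exact_values = [exact_y(x) for _ in range(5)]

# Probabilistic case: observations at X = 3 scatter around a central value.
observed = [noisy_y(x, rng) for _ in range(1000)]
best_guess = statistics.mean(observed)
errors = [y - best_guess for y in observed]

print("exact values:", exact_values)
print(f"best single prediction at X = 3: {best_guess:.2f}")
print(f"largest prediction error: {max(abs(e) for e in errors):.2f}")
```

In the exact case the prediction is never in error. In the probabilistic case the best
single prediction is close to the central value, but individual observations depart
from it.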

A distinction may be drawn between continuous and discrete (or discontinuous)
variables. A continuous variable may take any value within a
defined range of values. The possible values of the variable belong to a con- 
tinuous series. Between any two values of the variable an indefinitely large 
number of in-between values may occur. Height, weight, and chronological 
time are examples of continuous variables. A discontinuous or discrete 
variable can take specific values only. Size of family is a discontinuous vari- 
able. A family may be comprised of 1, 2, 3 or more children, but values 
between these numbers are not possible. The values obtained in rolling a 
die are 1, 2, 3, 4, 5, and 6. Values between these numbers are not possible. 
Although the underlying variable may be continuous, all sets of real data in 
practice are discontinuous or discrete. Convenience and errors of mea- 
surement impose restrictions on the refinement of the measurement 
employed. 

Another classification of variables is possible which is of some impor- 
tance and is of particular interest to statisticians. This classification is 
based on differences in the type of information which different operations 
of classification or measurement yield. To illustrate, consider the following 
situations. An observer using direct inspection may rank order a group of individuals
from the tallest to the shortest according to height. On the other hand, he may use a
foot rule and record the height of each individual in the group in feet and inches.
These two operations are clearly different, and the nature of the information obtained
by applying the two operations is different. The former operation permits statements of
the kind: individual A is taller or shorter than individual B. The latter operation
permits statements of how much taller or shorter one individual is than another.
Differences along these lines serve as a basis for a classification of variables, the
class to which a variable belongs being determined by the nature of the information made
available by the measuring operation used to define the variable. Four broad classes of
variables may be identified. These are referred to as (1) nominal, (2) ordinal, (3)
interval, and (4) ratio variables. A very interesting discussion relevant to this topic
is given in Torgerson (1958).
A nominal variable is a property of the members of a group defined by an operation
which permits the making of statements only of equality or difference. Thus we may state
that one member is the same as or different from another member with respect to the
property in question. Statements about the ordering of members, or the equality of
differences between members, or the number of times a particular member is greater than
or less than another are not possible. To illustrate, individuals may be classified by
the color of their eyes. Color is a nominal variable. The statement that an individual
with blue eyes is in some sense "greater than" or "less than" an individual with brown
eyes is meaningless. Likewise the statement that the difference between blue eyes and
brown eyes is equal to the difference between brown eyes and green eyes is meaningless.
The only kind of meaningful statement possible with the information available is that
the eye color of one individual is the same as or different from the eye color of
another. A nominal variable may perhaps be viewed as a primitive type of variable, and
the operations whereby the members of a group are classified according to such a
variable constitute a primitive form of measurement. In dealing with nominal variables
numerals may be assigned to represent classes, but such numerals are labels, and the
only purpose they serve is to identify the members within a given class.
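That numerals assigned to nominal classes are merely labels can be shown by recoding
the classes. The sketch below uses a small hypothetical set of eye colors and two
arbitrary numerical codings, both invented for illustration. The class frequencies are
unaffected by the choice of coding, whereas an arithmetic mean of the codes changes with
it and so carries no meaning.

```python
from collections import Counter
import statistics

# Hypothetical observations on a nominal variable.
eye_colors = ["blue", "brown", "brown", "green", "blue", "brown"]

# Two equally admissible assignments of numerals to classes (both arbitrary).
coding_a = {"blue": 1, "brown": 2, "green": 3}
coding_b = {"blue": 2, "brown": 3, "green": 1}


def class_sizes(values, coding):
    """Count members per class via the numeric codes, then report by label."""
    decode = {code: label for label, code in coding.items()}
    counts = Counter(coding[v] for v in values)
    return {decode[code]: n for code, n in counts.items()}


def mean_of_codes(values, coding):
    """Arithmetic mean of the codes; computable, but not meaningful."""
    return statistics.mean([coding[v] for v in values])


print(class_sizes(eye_colors, coding_a))   # same frequencies under either coding
print(class_sizes(eye_colors, coding_b))
print(mean_of_codes(eye_colors, coding_a), mean_of_codes(eye_colors, coding_b))
```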
An ordinal variable is a property defined by an operation which permits the rank
ordering of the members of a group; that is, not only are statements of equality and
difference possible, but also statements of the kind greater than or less than.
Statements about the equality of differences between members or the number of times one
member is greater than or less than another are not possible. If a judge is required to
order a group of individuals according to aggressiveness, or cooperativeness, or some
other quality, the resulting variable is ordinal in type. Many of the variables used in
psychology are ordinal.
An interval variable is a property defined by an operation which permits the making of
statements of equality of intervals, in addition to statements of sameness or difference
or greater than or less than. An interval variable does not have a "true" zero point,
although a zero point may for convenience be arbitrarily defined. Fahrenheit and
centigrade temperature measurements constitute interval variables. Consider three
objects, A, B, and C, with temperatures 12°, 24°, and 36°, respectively. It is
appropriate to say that the difference between the temperature of A and B is equal to
the difference in the temperature of B and C. It is appropriate also to say that the
difference between the temperature of A and C is twice the difference between the
temperature of A and B or B and C. It is not appropriate to say that B has twice the
temperature of A, or that C has three times the temperature of A. In common usage, if
the temperature yesterday was 64° and today it was 32°, we would not say that it was
twice as hot yesterday, or that the temperature was twice as great, as it was today.
Calendar time is also an interval variable with an arbitrarily defined zero.

A ratio variable is a property defined by an operation which permits the
making of statements of equality of ratios in addition to all other kinds of
statements discussed above. This means that one variate value, or measurement,
may be spoken of as double or triple another, and so on. An absolute zero is
always implied. The numbers used represent distances from a
natural origin. Length, weight, and the numerosity of aggregates are
examples of ratio variables. One object may be twice as long as another, or
three times as heavy, or four times as numerous. Many of the variables
used in the physical sciences are of the ratio type. In psychological work, 
variables which conform to the requirements of ratio variables are 
uncommon. Scales for measuring loudness, pitch, and other variables have 
been developed by Stevens (1957) at Harvard. These appear to satisfy all 
the conditions of ratio variables. 

The essential difference between a ratio and an interval variable is that 
for the former the measurements are made from a true zero point, whereas 
for the latter the measurements are made from an arbitrarily defined zero 
point or origin. Because of this, for a ratio variable ratios may be formed 
directly from the variate values themselves, and meaningfully interpreted. 
For an interval variable, ratios may be formed from differences between 
the variate values. The differences constitute a ratio variable, because the 
process of subtraction eliminates, or cancels out, the arbitrary origin. Dif- 
ferences are the same regardless of the location of the zero or origin. 
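The point may be checked numerically with the temperatures 12°, 24°, and 36° used
earlier. The sketch below assumes, for illustration only, that these readings are
centigrade and re-expresses them in Fahrenheit, which changes both the origin and the
size of the unit. Ratios of the readings themselves change with the scale, whereas
ratios of differences between readings do not.

```python
import math


def c_to_f(c):
    """Convert a centigrade reading to Fahrenheit."""
    return c * 9 / 5 + 32


# Temperatures of objects A, B, and C (assumed here to be in centigrade).
a_c, b_c, c_c = 12.0, 24.0, 36.0
a_f, b_f, c_f = (c_to_f(t) for t in (a_c, b_c, c_c))

# Ratio of readings: depends on the arbitrary zero, so it is not meaningful.
ratio_values_c = b_c / a_c
ratio_values_f = b_f / a_f

# Ratio of differences: the arbitrary origin cancels in the subtraction.
ratio_diffs_c = (c_c - a_c) / (b_c - a_c)
ratio_diffs_f = (c_f - a_f) / (b_f - a_f)

print(f"B/A in centigrade: {ratio_values_c:.3f}, in Fahrenheit: {ratio_values_f:.3f}")
print(f"(C-A)/(B-A) in centigrade: {ratio_diffs_c:.3f}, in Fahrenheit: {ratio_diffs_f:.3f}")
```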

A variety of refinements and elaborations can be made on the distinction
between nominal, ordinal, interval, and ratio variables. For a more detailed
discussion the reader is referred to Torgerson (1958). 

Some writers distinguish between quantitative and qualitative variables without being
explicit about the nature of this distinction. In the present classificatory system
nominal and ordinal variables may be spoken of as qualitative, and interval and ratio
variables as quantitative.

Statistical methods exist for the analysis of data composed of nominal variables,
ordinal variables, and interval and ratio variables. From the viewpoint of practical
statistical work in psychology and education the distinction between interval and ratio
variables is perhaps unimportant, and it is convenient to think of three, and not four,
classes of variables, with three corresponding classes of statistical method.
Procedures for the analysis of interval and ratio variables constitute by far the
largest, and most important, class of statistical method.
In practice we frequently apply methods appropriate to one class of variable in the
statistical analysis of other classes of variables. This means that we either discard
information which we do in fact possess or assume that we have information which we do
not possess. An example of the former situation arises where measurements of the
interval or ratio type are replaced by ranks for purposes of analysis. Measurements of
height or weight for a group of N subjects may be replaced by the ranks 1, 2, 3, . . . ,
N, and subsequent analysis based on these ranks. Further, measurements may be divided
into broad classes, say top third, middle third, and bottom third, and treated as a
nominal variable. A currently popular class of statistical method is called
nonparametric statistics (see Part III). Many nonparametric statistical procedures
convert problems involving interval and ratio variables to problems that involve a
consideration of either nominal categories or ranks.
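The discarding of information described above can be sketched in a few lines. The
heights below are hypothetical, the helper names to_ranks and to_thirds are introduced
here for illustration, and tied values are assumed not to occur, since the treatment of
ties is a separate matter.

```python
def to_ranks(values):
    """Replace measurements by ranks 1..N (1 = smallest). Assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def to_thirds(values):
    """Replace measurements by the broad classes bottom, middle, and top third."""
    n = len(values)
    labels = []
    for rank in to_ranks(values):
        if rank <= n / 3:
            labels.append("bottom")
        elif rank <= 2 * n / 3:
            labels.append("middle")
        else:
            labels.append("top")
    return labels


# Hypothetical heights in inches for N = 6 subjects.
heights = [64.2, 71.5, 68.0, 66.3, 74.1, 69.9]

print(to_ranks(heights))    # sizes of the differences between subjects are discarded
print(to_thirds(heights))   # order within each third is discarded as well
```

Each step retains less of the original information: the ranks preserve order only, and
the thirds preserve only membership in a broad class.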
In the analysis of statistical data in psychology and education the investigator not
infrequently assumes that he has information which actually he does not have. Variables
which are in fact ordinal may be treated by a method appropriate for interval and ratio
variables. An example of this situation arises when the members of a group are ordered
with regard to some property. The information consists of relations of "greater than" or
"less than," and these are described by a set of ordinal numbers; thus one member is
first, another second, and so on. It is common practice to replace such a set of ordinal
numbers by the corresponding set of cardinal numbers, 1, 2, 3, . . . , N, and to proceed
to apply arithmetical operations to these numbers. This means that certain assumptions
are made. Information is superimposed on the data which the measuring operation did not
yield; that is, for computational purposes we assume we are in possession of information
which actually we do not have. In the above instance we are making an assumption about
the equality of intervals when in fact the measuring operation employed does not yield
information of this kind. The assumption is that the difference between the first and
second individual is equal to the difference between the second and third, and so on.
In psychological work many variables are in fact ordinal, although for statistical
purposes they are, quite justifiably, commonly treated as if they were interval or ratio
variables. For example, scores on intelligence tests, scholastic aptitude tests,
attitude tests, personality tests, and the like, are in effect ordinal variables,
although they are commonly treated as if they were of the interval or ratio type. No
aspect of the operation of measuring intelligence, let us say, is such as to permit the
making of meaningful statements about the equality of intervals or ratios. We cannot say
that the difference in intelligence between a person with an IQ of 80 and one with an IQ
of 90 is in any sense equal to the difference in intelligence between a person with an
IQ of 110 and one with an IQ of 120. Nor can we meaningfully assert that a person with an
IQ of 120 is twice as intelligent as a person with an IQ of 60. Such statements are
without meaning. Despite
this, IQ's are commonly treated by statistical methods which, from a 
rigorously logical viewpoint, are appropriate only to interval and ratio vari- 
ables. The suggestion is not made here that the practice of assuming that 
we have information we do not have, or the converse practice of discarding 
information we do in fact have, be discontinued, although a logical purist 
might be led to this position. Frequently practical necessity dictates a par- 
ticular procedure. Nevertheless it is a matter of some importance to know 
the nature of the information contained in the data. We should be able to
distinguish clearly between this and the information either imposed or dis- 
carded for the purpose of making some process of calculation possible. In 
other words, our understanding of precisely what we are doing is enriched 
by knowing the nature of the assumptions made at each stage in the 
application of any procedure. 


EXPERIMENTAL AND CORRELATIONAL INVESTIGATIONS 


Much scientific enquiry is concerned with an exploration of the relations 
between variables. Some investigations involve a study of the relations 
between many variables; others, of the relation between two variables only, 
and are of the general form Y = f(X). To illustrate the two-variable situation, Y may
be a measure of motor performance; and X, four different dosages of alcohol, each dosage
administered to 10 different individuals. Y may be a measure of intensity of belief in
the prevalence of witches; and X, three specified times before, during, and after
Halloween. Y may be average first-year marks in a university; and X, scores on a
scholastic aptitude test. Again Y may be the presence or absence of cancer of the lung
at age 50; and X, some index or measure of the amount of cigarette smoking. In these
examples the investigator is concerned with an examination of the nature of the relation
between two variables. The reader should note here that both Y and X may be nominal,
ordinal, interval, or ratio variables. Thus both Y and X may be nominal, Y may be
nominal and X ordinal or interval, and so on.
A useful and, indeed, important distinction may be made between experimental
and correlational investigations. In an experiment the values of
the X variable, and the frequency of occurrence of these values, are fixed 
by the investigator. In the illustrative example on the relation between 
motor performance and alcohol, the investigator determines the number of 
dosages, the amount of each dosage, and the number of experimental sub- 
jects receiving each dosage. If the experiment were repeated to check the 
results, the same dosage of alcohol would be used. In a correlational study 
the particular values of the variable, and the frequency of their occurrence, 
are not fixed, or controlled, by the investigator. In the example on the rela- 
tion between first-year averages in university and scholastic aptitude test 
scores, the investigator may draw a sample of students for whom both 
first-year averages and scholastic aptitude test scores are available. He may 
then proceed to study the relation between these two variables. He exerts no control
over the magnitude of particular scholastic aptitude test scores or the frequency of
their occurrence. If the investigation were repeated, another sample of subjects would
be used, and the particular scholastic aptitude test scores, and the frequency of their
occurrence, might be expected to differ in some degree from that found in the first
enquiry.
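The difference between the two kinds of investigation, as it bears on the X variable,
may be sketched as follows. The dosage levels, the simulated population of aptitude
scores, and the sample size of 40 are all hypothetical choices made for illustration. In
the experimental design the values of X and their frequencies are identical on
repetition. In the correlational study they depend on which individuals happen to be
drawn.

```python
import random
from collections import Counter


def experimental_design(doses, subjects_per_dose):
    """X values and their frequencies are fixed in advance by the investigator."""
    return [dose for dose in doses for _ in range(subjects_per_dose)]


def correlational_sample(population_scores, n, rng):
    """X values are whatever the sampled individuals happen to possess."""
    return rng.sample(population_scores, n)


# Experiment: four hypothetical dosage levels (arbitrary units), 10 subjects each.
doses = [0, 1, 2, 3]
run_1 = experimental_design(doses, 10)
run_2 = experimental_design(doses, 10)   # a repetition uses the same X values

# Correlational study: aptitude scores from a simulated population (not real data).
rng = random.Random(7)
aptitude_population = [round(rng.gauss(500, 100)) for _ in range(5000)]
study_1 = correlational_sample(aptitude_population, 40, rng)
study_2 = correlational_sample(aptitude_population, 40, rng)   # a repetition

print("experiment, X frequencies:", dict(Counter(run_1)), dict(Counter(run_2)))
print("correlational, X identical on repetition:", Counter(study_1) == Counter(study_2))
```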

In psychology and education use is made of both experimental and correlational studies.
The experimentalist is himself the creator of variation. The correlationist studies the
variation which already exists in nature. Cronbach (1957) has examined in detail the
relation between the experimental and correlational disciplines in psychology. He
writes,


The well-known virtue of the experimental method is that it brings situational variables under 
tight control. It thus permits rigorous tests of hypotheses and confident statements about
causation. The correlational method, for its part, can study what man has not learned to
control. Nature has been experimenting since the beginning of time, with a boldness and 
complexity far beyond the resources of science. The correlator’s mission is to observe and 
organize the data of nature’s experiments. 


1.9 ON CALCULATING
Skill in the operation of a calculating machine, and, if possible, some knowledge of
electronic computers, should be acquired at an early stage in studying statistics. Many
of the statistical problems which present themselves in experimental work in psychology
and education involve much computation, and without a calculator the arithmetical labor
is prohibitive. Indeed, many forms of modern statistical analysis are so complex that
their application is simply not possible without access to a high-speed electronic
computer. Skill in the operation of a calculator can be readily acquired, a reasonable
level of performance on simple operations being attained by most students in a few hours
of practice.
Statistical tables of various kinds are frequently required in computational work. A
useful aid in computing is Barlow's Tables. The 1965 paperback edition of these tables
provides the squares, cubes, square roots, cube roots, and reciprocals of all integers
up to 12,500. The reader may also find Basic Mathematical and Statistical Tables for
Psychology and Education (Meredith, 1967) a convenient source of information on many
numerical functions and commonly used probability distributions. Statistical Tables for
Biological, Agricultural and Medical Research, by Fisher and Yates (1963), has been
widely used by statisticians for many years and currently is available in its sixth
edition.
In computing, the importance of adequate checks on the accuracy of the calculation
cannot be too emphatically stressed. Every calculation should be checked by the
employment of some checking device which guarantees accuracy. There is no substitute for
accuracy. The conduct of an experiment serves no purpose unless correct inferences are
drawn from the data. The correctness of the inferences drawn cannot be assured unless
the statistical procedures employed are appropriate to the data and unless these
procedures are accurately applied. Students not infrequently feel that the 
statistical analysis of a set of data is laborious and time-consuming, and in 
their haste to arrive at some kind of result they may disregard checks 
which are necessary to ensure the accuracy of their calculations. When 
tempted in this direction, the student should observe that the time spent in 
the proper statistical analysis of a set of data represents in most instances a 
small proportion of the time required to plan the experiment and gather the 
data. A slipshod analysis may throw in jeopardy the total investment of 
time and effort. 

The availability of electronic computers has changed many aspects of
modern statistical work. Many computational procedures which have been 
devised to simplify arithmetical calculation are now obsolete. Forms of 
calculation which were hitherto impossible can now be readily carried out 
in short periods of time. Many investigators now conduct elaborate analy- 
ses of such complexity that their proper interpretation demands much in 
the way of statistical sophistication. Not uncommonly investigators are 
bewildered by the deluge of calculations which computers quickly regurgi- 
tate. The computer has clearly not only enhanced the importance of an 
understanding of basic statistical ideas, as distinct from computational 
methods, but has also reemphasized the importance of simple ways of 
viewing complex problems. Also, the computer has led to the increased use 
of multivariate statistical methods and has stimulated the development of
such methods. These are methods for the treatment of data comprised of
many variables. 


UNITS OF MEASUREMENT 


When dealing with continuous variables a unit of measurement may be 
regarded as any defined subdivision of a scale, however fine. In measuring 
length the units may be inches, yards, and miles or centimeters, meters, 
and kilometers. In measuring weight the units may be ounces and pounds 
or grams and kilograms. In measuring chronological time the units may be 
seconds, minutes, hours, days, months, or years. 

With continuous variables, although all values are theoretically possible within any
range of values, we select a unit of measurement and record our observations as discrete
values. All experimental observations, however obtained, are recorded as discrete
values. Thus the length of a desk or the height of a man may be measured to the nearest
inch, or tenth of an inch, or hundredth of an inch, the unit of measurement in each case
being 1 in., 1/10 in., or 1/100 in., respectively, and the number of such units involved
in any particular measurement must, of necessity, be recorded as a discrete number.

The fineness of the unit of measurement employed is determined by the accuracy which the
nature of the situation demands or by the accuracy which the instrument of measurement
allows, or both. In the measurement of time intervals, for example, great accuracy can
be obtained by the use of appropriate measuring devices. In measuring the time required
for a child to solve a problem it is certainly adequate for all practical purposes to
record the observation in seconds. In reaction-time experiments, however, we may require
a unit of measurement of a hundredth or perhaps a thousandth part of a second. Further,
the unit should reflect the accuracy of the measuring operation. To illustrate, an
intelligence quotient is calculated by dividing mental age by chronological age, both
expressed in months, and multiplying by 100. Quite clearly, we could speak of an
intelligence quotient as being 103.3, or 103.23, or something of the sort. Such an
attempt at accuracy would be spurious because of the large error of measurement which is
known to attach to the intelligence quotient. In practice, intelligence quotients are
always recorded to the nearest whole number.

When we record measurements of a continuous variable as discrete
numbers in so many units, we imply in most cases that had a more accurate
form of measurement been used, were this possible and desirable, the
value thereby obtained would fall within certain limits, these limits being
defined as one-half a unit above and below the value reported. Thus when
we report a measurement to the nearest inch, say, 26 in., this is assumed to
mean that the observation falls within the limits 25.5 and 26.5, or more
precisely that it is greater than or equal to 25.5 and less than 26.5.
Likewise, a measurement made to the nearest tenth part of an inch, say, 31.7,
is assumed to fall within the limits 31.65 and 31.75. In a reaction-time
experiment a particular observation measured to the nearest thousandth of a
second might be, say, .196 sec. This means that the measurement is taken
as falling within the limits .1955 and .1965 sec.
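
The convention of one-half a unit above and below the reported value can be written out in a few lines. The following is a minimal sketch in Python and is not part of the original text. The function name `implied_limits` is hypothetical, and decimal arithmetic is assumed only so that limits such as .1955 are held exactly.

```python
from decimal import Decimal

def implied_limits(value: str, unit: str) -> tuple[Decimal, Decimal]:
    """Limits implied by a value recorded to the nearest `unit`.

    The observation is taken to be greater than or equal to the lower
    limit and less than the upper limit.
    """
    half_unit = Decimal(unit) / 2
    recorded = Decimal(value)
    return recorded - half_unit, recorded + half_unit

# The three illustrations used in the text.
print(implied_limits("26", "1"))         # nearest inch
print(implied_limits("31.7", "0.1"))     # nearest tenth of an inch
print(implied_limits("0.196", "0.001"))  # nearest thousandth of a second
```

Age, as conventionally stated, does not follow this rule and would need a separate treatment.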

An exception to the above is age. When we state that a person is 18 years
old, we do not mean in conventional usage that his age falls within the
limits 17 years 6 months and 18 years 6 months. A person is ordinarily
spoken of as 18 years old until his 19th birthday. His age is greater than or
equal to 18 years and less than 19 years. Similarly to state that a person is
126 months old means that he is greater than or equal to 126 months and
less than 127 months. Definitions of age other than the above are used for
particular purposes.

Questions of the above kind do not, of course, arise with discrete
variables. The number of animals in a cage, or children in a classroom, or teeth
in a child's head are discrete observations, and to imply a range of values
within which any particular observation is assumed to fall is not
meaningful.


SUMMATION NOTATION 

Statistical notation is a language with its own grammatical rules. One of the
more frequently used forms of notation is spoken of as summation notation.
Some familiarity with this class of notation should be acquired as early as
possible in the study of statistics.


Let X be a variable and $X_1, X_2, \ldots, X_N$ a set of variate values. To
illustrate, X might refer to a measure of the activity of rats in a maze. The
symbols $X_1, X_2, \ldots, X_N$ would then refer to measures of activity for
individual rats, there being N rats in the group. The sum
$X_1 + X_2 + \cdots + X_N$, that is, all the individual measures added
together, may be written as

$$\sum_{i=1}^{N} X_i$$

Thus

[1.1]  $$\sum_{i=1}^{N} X_i = X_1 + X_2 + \cdots + X_N$$

The symbol $\Sigma$ is the Greek capital letter sigma and refers to the simple
operation of adding things up. The symbols above and below the summation
sign define the limits of the summation. Thus $\sum_{i=1}^{N}$ means the
addition of all values formed by assigning to i the values of every positive
integer from i = 1 to i = N, inclusive. For example, let the numbers 10, 12,
19, 21, 32 be measures of activity for a group of five rats. The sum of these
five scores may be represented symbolically by $\sum_{i=1}^{5} X_i$ and in
this case is equal to 94.

Where the limits of the summation are clearly understood from the
context, which is very frequently the case, it is customary to omit the
notation above and below the summation sign and write $\sum X_i$, or simply
$\sum X$.

There are a number of very simple theorems which are useful in
handling problems involving summation notation.

Theorem 1  If every variate value in a group is multiplied by a constant
number or factor, that factor may be removed from under the summation sign
and written outside as a factor. Thus

[1.2]  $$\sum_{i=1}^{N} cX_i = cX_1 + cX_2 + \cdots + cX_N = c(X_1 + X_2 + \cdots + X_N) = c\sum_{i=1}^{N} X_i$$

This means that if we multiply each one of the measures 10, 12, 19, 21, 32
by any constant, say, 5, the sum of the resulting measures will be given
directly by 5 × 94.

Theorem 2  The summation of a constant over N terms is equal to Nc. Thus

[1.3]  $$\sum_{i=1}^{N} c = c + c + \cdots + c = Nc$$

If c = 5 and N equals 4, it is obvious that 5 + 5 + 5 + 5 = 4 × 5 = 20.


Theorem 3  The summation of the sum of any number of terms is the sum of
the summations of these terms taken separately. Thus

$$\sum_{i=1}^{N} (X_i + Y_i + Z_i) = X_1 + Y_1 + Z_1 + X_2 + Y_2 + Z_2 + \cdots + X_N + Y_N + Z_N$$

$$= \sum_{i=1}^{N} X_i + \sum_{i=1}^{N} Y_i + \sum_{i=1}^{N} Z_i$$

Theorem 4  The sum of the first N integers is

$$\frac{N(N+1)}{2}$$

Consider the integers 1, 2, 3, . . . , (N − 2), (N − 1), N. It is observed
that the sum of the first and last integers in the series is equal to N + 1, the
sum of the second and the second from the last is equal to N + 1, and so
on. In any series where N is even, there are N/2 such pairs, and the sum of
the series is given by N(N + 1)/2. Where N is odd, there are (N − 1)/2
such pairs, plus the middle term, which is equal to (N + 1)/2. The sum of
the series is then

[1.5]  $$\frac{N-1}{2}(N+1) + \frac{N+1}{2} = \frac{N(N+1)}{2}$$
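
The four theorems can be checked numerically. The sketch below, in Python, is an illustration added here and not part of the original text. It uses the five rat scores given earlier; the lists `Y` and `Z` are hypothetical values introduced only so that Theorem 3 has three sets of terms to add.

```python
# Activity scores for the five rats used in the text.
X = [10, 12, 19, 21, 32]
# Hypothetical second and third sets of scores (assumed, for Theorem 3 only).
Y = [3, 8, 5, 1, 9]
Z = [2, 2, 7, 4, 6]
c = 5

# Theorem 1: a constant factor may be taken outside the summation sign.
theorem_1 = sum(c * x for x in X) == c * sum(X)

# Theorem 2: summing a constant over N terms gives Nc (here N = 4).
theorem_2 = sum(c for _ in range(4)) == 4 * c

# Theorem 3: the summation of a sum is the sum of the separate summations.
theorem_3 = sum(x + y + z for x, y, z in zip(X, Y, Z)) == sum(X) + sum(Y) + sum(Z)

# Theorem 4: the first N integers add up to N(N + 1)/2, for odd and even N.
theorem_4 = all(sum(range(1, n + 1)) == n * (n + 1) // 2 for n in range(1, 101))

print(sum(X), theorem_1, theorem_2, theorem_3, theorem_4)
```

A check of this kind verifies particular cases only; the general results rest on the algebra given above.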


" 

An expression frequently encountered in statistics is
$\sum_{i=1}^{N} X_iY_i$. This refers to the sum of the products of two sets
of paired numbers. If, for example, 5, 6, 12, 15 are the scores, X, of four
people on a test, and 2, 3, 7, 10 are the scores, Y, of the same four people
on another test, then $\sum_{i=1}^{4} X_iY_i$ refers to the sum of products
and is equal to 5 × 2 + 6 × 3 + 12 × 7 + 15 × 10, or 262.
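
The sum of products for these four pairs of scores may be computed as follows. This is a short Python sketch added for illustration; it also shows, as a caution not stated in the original, that the sum of products is in general different from the product of the two sums.

```python
# Scores of the same four people on two tests (from the text).
X = [5, 6, 12, 15]
Y = [2, 3, 7, 10]

# Sum of products: multiply each pair, then add.
sum_of_products = sum(x * y for x, y in zip(X, Y))

# Product of sums: add each set first, then multiply. A different quantity.
product_of_sums = sum(X) * sum(Y)

print(sum_of_products, product_of_sums)
```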

The following are examples which illustrate the application of the above
theorems.


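Example 1  $$\sum_{i=1}^{N} (X_i + c) = \sum_{i=1}^{N} X_i + \sum_{i=1}^{N} c = \sum_{i=1}^{N} X_i + Nc$$

Example 2  $$\sum_{i=1}^{N} (X_i + Y_i + c) = \sum_{i=1}^{N} X_i + \sum_{i=1}^{N} Y_i + \sum_{i=1}^{N} c = \sum_{i=1}^{N} X_i + \sum_{i=1}^{N} Y_i + Nc$$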

N 


Example 3 P пао EOP + BAH оу Bex + а 


x 
=> x P LIS 


" N 
Example 4  $$\sum_{i=1}^{N} (X_i + Y_i)^2 = \sum_{i=1}^{N} X_i^2 + 2\sum_{i=1}^{N} X_iY_i + \sum_{i=1}^{N} Y_i^2$$


Example 5


$ +] ў Dee zxee-e]p


E x N м
=D X+ их E Fy X


Example 6  $$\sum_{i=1}^{N} \left[(X_i + Y_i)^2 - (X_i - Y_i)^2\right] = \sum_{i=1}^{N} \left[(X_i^2 + Y_i^2 + 2X_iY_i) - (X_i^2 + Y_i^2 - 2X_iY_i)\right]$$

$$= \sum_{i=1}^{N} \left[X_i^2 + Y_i^2 + 2X_iY_i - X_i^2 - Y_i^2 + 2X_iY_i\right]$$

$$= 4\sum_{i=1}^{N} X_iY_i$$

Example 7  $$\sum_{i=1}^{N} (X_i - 3Y_i)^2 = \sum_{i=1}^{N} (X_i^2 + 9Y_i^2 - 6X_iY_i) = \sum_{i=1}^{N} X_i^2 + 9\sum_{i=1}^{N} Y_i^2 - 6\sum_{i=1}^{N} X_iY_i$$

Example 8  Let X denote the integers 1, 2, 3, . . . , N.

$$\sum_{i=1}^{N} (X_i + c) = \sum_{i=1}^{N} X_i + Nc = \frac{N(N+1)}{2} + Nc$$

... manipulation can be acquired with a little practice. A goo...
statistics.


EXERCISES

1 Indicate with examples the differences between (a) population and
sample, (b) finite and infinite populations, (c) descriptive and sampling
statistics, (d) parameters and estimates, (e) dependent and independent
variables, (f) continuous and discrete variables, (g) experiments
and correlational studies.


2 Classify the following as nominal, ordinal, interval, or ratio variables:
(a) height, (b) weight, (c) examination marks, (d) sex, (e) eye color, (f)
calendar time, (g) age, (h) racial origin, (i) temperature, (j) ratings of
scholastic success.

3 Write the following in summation notation:
a XXe TX
b ИИ... + Ух

(X, Yi) + (Х, + Y) + ccc (X: +Y,) 
ХУ, + Хоу, + + + + +Xy¥y 

AQ, ХҮ, + 5 XY. 

(X, + с) + (Же) ++ > + + (+ с) 
cX, + СХ, + + + + 0X05 

Xet Xde + + + + +Хыс 
ХҮ, + сҮ, + + + +сХМУ, 


Write each of the following in full: 


e X OY) t IEX Y, 


ізі ізі 


Show that 
$ Gere Y Х2+ 5 X,+ Nc 


ізі ізі 


Consider the following paired observations: 


f У>(Х,—5)(Ү,—4) 
в (5X; — 4Y;)? 
У(Х, —5)* h Х(Х,-3%: 
>(Ү,—4)* i E(XJY) 
IXY: 


In all instances the summation is understood to extend over the fi 
paired observations. L 


Which of the following are true and which are false?


a SaS Vim $ жи 


ici ici =1 


ь (Zaz 


ізі 
e +) —) -ir- = м2 
ізі 
N 
d Y Qu eve Y xe Y? «2 xy, 
ізі 
What is the sum of the first 100 integers? 


9 If $X_1 = 4$, $X_2 = 6$, $X_3 = 3$, and $X_4 = 7$, obtain the following:
4 4 
а У (42-4) b У (X8-X2+X,) 


10 Obtain 


a 56 


FREQUENCY DISTRIBUTIONS AND THEIR GRAPHIC REPRESENTATION


2.1 INTRODUCTION

The data obtained from the conduct of experiments or correlational studies
are frequently collections of numbers. Classification and description of
these numbers are required to assist interpretation. Under certain
circumstances advantages attach to the classification of the data in the form of
frequency distributions. Such classification may help the investigator to
understand important features of the data. A frequency distribution is an
arrangement of the data that shows the frequency of occurrence of the
different values of the variable or the frequency of occurrence of values
falling within arbitrarily defined intervals of the variable. The meaning of
this latter statement will become clear as we proceed. This chapter
discusses the organization of data in the form of frequency distributions, and
the graphic representation of frequency distributions. Chaps. 3 and 4, to
follow, discuss the descriptive numerical properties of frequency
distributions, or the properties of the collections of numbers of which these
distributions are comprised.


2.2 CLASSIFICATION OF DATA

Consider the data in Table 2.1. These are the intelligence quotients of 100
children obtained from a psychological test. As a first step in the direction
of classification we may rank order the 100 intelligence quotients in order
of magnitude, proceeding from the largest to the smallest as shown in


Table 2.1*  Intelligence quotients made by 100 pupils on a mental test

109  111   82  105  134
113   90   79   10  117
 80   90  121   75   93
 99   90   92   96   82
101  104   80   81   83
104   93  109   72  110
111   91   10  111   81
122   83   92  101   77
 99  103   93   91   67
108   93   84   84  100
102   84   96   89   81
107   95   91  107  102
109   93   82  103  116
 86   78   73  104   10
103  108   76   94  108
 72   87  121   80  127
105  103  106  119   90
 93   89  110  103  100
 99   79  117  114  117
 93   82   98   89  119

* Tables 2.1 to 2.6 are reproduced from R. W. B. Jackson and George A.
Ferguson, Manual of educational statistics, University of Toronto,
Department of Educational Research, Toronto, 1942.


Table 2.2  Rank distribution of intelligence quotients shown in Table 2.1

134  109  102   93   82
127  109  101   92   82
122  108  101   92   82
121  108  100   91   82
121  108  100   91   81
119  107  100   91   81
119  107   99   90   81
117  106   99   90   80
117  105   99   90   80
117  105   98   90   80
116  104   96   89   79
114  104   96   89   79
113  104   95   89   78
111  104   94   87   77
111  103   93   86   76
111  103   93   84   75
110  103   93   84   73
110  103   93   84   72
109  103   93   83   72
109  102   93   83   67


Table 2.2. An arrangement of this kind is called a rank distribution. Such
an arrangement of data has fe... however, shows that many scores ...
103's, three 100's, and so on. Thi... in columns, as shown in Table 2.3,
and the number of times a particular score value occurs is a frequency,
represented by the symbol f.

In Table 2.3 the data have been classified in as many classes as there are
score values within the total range of the variable. The number of classes is
large. Usually it is advisable to reduce the number of classes by arranging
the data in arbitrarily defined groupings of the variable; thus all scores
within the range 65 to 69, that is, all scores with the values 65, 66, 67, 68,



Table 2.3  Frequency distribution of intelligence quotients of Table 2.1
with as many classes as score values

Score   f     Score   f     Score   f     Score   f
134     1     117     3     100     3      83     2
133     -     116     1      99     3      82     4
132     -     115     -      98     1      81     3
131     -     114     1      97     -      80     3
130     -     113     1      96     2      79     2
129     -     112     -      95     1      78     1
128     -     111     3      94     1      77     1
127     1     110     2      93     7      76     1
126     -     109     4      92     2      75     1
125     -     108     3      91     3      74     -
124     -     107     2      90     4      73     1
123     -     106     1      89     3      72     2
122     1     105     2      88     -      71     -
121     2     104     4      87     1      70     -
120     -     103     5      86     1      69     -
119     2     102     2      85     -      68     -
118     -     101     2      84     3      67     1


and 69, may be grouped together. All scores within the ranges 70 to 74, 75
to 79, and so on, may be similarly grouped. Such groupings of data are
usually done by entering a tally mark for each score opposite the range of
the variable within which it falls and counting these tally marks to obtain
the number of cases within the range. This procedure is shown in
Table 2.4.

The range of the variable adopted is called the class interval. In the
illustration in Table 2.4 the class interval is 5. This arrangement of data is
also a frequency distribution, and the number of cases falling within each
class interval is a frequency. The only difference between Tables 2.3 and 2.4
is in the class interval, which is 1 in the former case and 5 in the latter.
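
The tallying procedure can be imitated in code. The following Python sketch is an illustration added here, not the authors' procedure. It rebuilds the 100 scores from the frequencies of Table 2.3, taking the frequency of the score 93 as 7 so that the total is 100, and groups them with a class interval of 5 beginning at 65. The function name `grouped_distribution` is hypothetical.

```python
from collections import Counter

# Score -> frequency, as in Table 2.3 (score values with no cases are omitted).
TABLE_2_3 = {
    134: 1, 127: 1, 122: 1, 121: 2, 119: 2, 117: 3, 116: 1, 114: 1, 113: 1,
    111: 3, 110: 2, 109: 4, 108: 3, 107: 2, 106: 1, 105: 2, 104: 4, 103: 5,
    102: 2, 101: 2, 100: 3, 99: 3, 98: 1, 96: 2, 95: 1, 94: 1, 93: 7, 92: 2,
    91: 3, 90: 4, 89: 3, 87: 1, 86: 1, 84: 3, 83: 2, 82: 4, 81: 3, 80: 3,
    79: 2, 78: 1, 77: 1, 76: 1, 75: 1, 73: 1, 72: 2, 67: 1,
}
scores = [score for score, f in TABLE_2_3.items() for _ in range(f)]

def grouped_distribution(scores, width, start):
    """Count the scores falling in each class interval of the given width.

    Intervals begin at start, start + width, and so on. Rows are returned
    from the highest interval down, as in Table 2.4. All scores are assumed
    to be at least `start`.
    """
    counts = Counter((s - start) // width for s in scores)
    rows = []
    for k in range(max(counts), -1, -1):
        lower = start + k * width
        rows.append(((lower, lower + width - 1), counts.get(k, 0)))
    return rows

for (lower, upper), f in grouped_distribution(scores, width=5, start=65):
    print(f"{lower}-{upper}: {f}")
```

With `width=1` the same function reproduces a distribution with as many classes as score values, which is the only respect in which Tables 2.3 and 2.4 differ.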



2.3 CONVENTIONS REGARDING CLASS INTERVALS


In the arrangement of data with a class interval of 1, as shown in Table 2.3,
the original observations are retained and may be reconstructed directly
from the frequency distribution without loss of information. If the class
interval is greater than 1, say, 2, 5, or 10, some loss of information
regarding individual observations is incurred; that is, the original
observations cannot be reproduced exactly from the frequency distribution. If the
class interval is large in relation to the total range of the set of observations,
this loss of information may be appreciable. If the class interval is small,
the classification of data in the form of a frequency distribution may lead
to very little gain in convenience over the utilization of the original
observations.


Table 2.4  Frequency distribution of the intelligence quotients of Table 2.1

Class interval   Tally                      Frequency
130-134          |                               1
125-129          |                               1
120-124          |||                             3
115-119          ||||/ |                         6
110-114          ||||/ ||                        7
105-109          ||||/ ||||/ ||                 12
100-104          ||||/ ||||/ ||||/ |            16
 95-99           ||||/ ||                        7
 90-94           ||||/ ||||/ ||||/ ||           17
 85-89           ||||/                           5
 80-84           ||||/ ||||/ ||||/              15
 75-79           ||||/ |                         6
 70-74           |||                             3
 65-69           |                               1
Total                                          100


The rules listed below are widely used in the selection of class intervals.
These rules lead in most cases to a convenient handling of the data.

1 Select a class interval of such a size that between 10 and 20 such
intervals ... range of the observations. For example, if ...

2 Select class intervals with a range of 1, 2, 3, 4, 5, 10, or 20 points. These
will meet the requirements of most sets of data.

3 Start the class interval at a v... interval. For example, with ...
values 2, 4, 6, 8, 10, etc. This is, of course, highly arbitrary.

4 Arrange the class intervals according to the order of magnitude of the
observations they include, the class interval containing the largest
observations being placed at the top.
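
Rules 1 and 2 can be applied mechanically to find the interval sizes worth considering. The Python sketch below is an illustration only. The function name `candidate_widths` is hypothetical, and the choice to begin the lowest interval at a multiple of the interval size is an assumption made for this sketch, since the wording of rule 3 is incomplete in the source.

```python
# Interval sizes named in rule 2.
ALLOWED_WIDTHS = (1, 2, 3, 4, 5, 10, 20)

def candidate_widths(lowest, highest, min_classes=10, max_classes=20):
    """Return (width, start, number_of_classes) for each size meeting rule 1.

    Assumption: the lowest interval starts at the largest multiple of the
    width that does not exceed the lowest observation.
    """
    candidates = []
    for width in ALLOWED_WIDTHS:
        start = (lowest // width) * width
        classes = (highest - start) // width + 1
        if min_classes <= classes <= max_classes:
            candidates.append((width, start, classes))
    return candidates

# Intelligence quotients of Table 2.1 range from 67 to 134.
print(candidate_widths(67, 134))
```

For these data an interval of 5 beginning at 65 gives the 14 classes of Table 2.4; an interval of 4 would also satisfy rule 1, and the choice between them remains a matter of convenience.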


2.4 EXACT LIMITS OF THE CLASS INTERVAL


Where the variable under consideration is continuous, and not discrete, we 
select a unit of measurement and record our observations as discrete 
values. When we record an observation in discrete form and the variable is 
a continuous one, we imply that the value recorded represents a value 
falling within certain limits. These limits are usually taken as one-half a 
unit above and below the value reported. Thus when we report a measurement
to the nearest inch, say, 16 in., we mean that, if a more accurate form
of measurement had been used, the value obtained would fall within the 
limits 15.5 and 16.5 in. Similarly, a measurement made to the nearest tenth 
part of an inch, say, 31.7 in., is understood to fall within the limits 31.65 and 
31.75 in. In a reaction-time experiment a particular observation measured 
to the nearest thousandth of a second might be, say, .196 sec. This assumes 
that had a more accurate timing device been used, the measurement would 
have been found to fall somewhere within the limits .1955 and .1965 sec. 

Class intervals are usually recorded to the nearest unit and thereby 
reflect the accuracy of measurement. For various reasons it is frequently 
necessary to think in terms of so-called exact limits of the class interval. 
These are sometimes spoken of as class boundaries, or end values, and 
sometimes as real limits. Consider the class interval 95 to 99 in Table 2.4. 
We grouped within this interval all measurements taking the values 95, 96, 
97, 98, and 99. The limits of the lower value are 94.5 and 95.5, while those 
of the upper value are 98.5 and 99.5. The total range, or exact limits, which 
the interval is presumed to cover is then clearly 94.5 and 99.5, which means 
all values greater than or equal to 94.5 and less than 99.5. 

The above discussion is applicable to continuous variables only. With 
discrete variables no distinction need be made between the class interval 
and the exact limits of the interval, the two being identical.

Table 2.5 shows the frequency distribution of the intelligence quotients 
of Table 2.1. Column 1 shows the class interval as usually written, while 
col. 2 records the exact limits. In practice, of course, the exact limits are 


rarely recorded as in Table 2.5. 


2.5 DISTRIBUTION OF OBSERVATIONS WITHIN THE CLASS INTERVAL


The grouping of data in class intervals results in a loss of information 
regarding the individual observations themselves. Scores may differ one 
from another within a limited range and yet all be grouped within the same
interval. In the calculation of certain statistics and in the preparation of 
graphs it becomes necessary to make certain assumptions regarding the 
values within the intervals. Two separate assumptions may be made, 
depending on the purposes we have in mind. 

The first assumption states that the observations are uniformly distrib- 
uted over the exact limits of the interval. This assumption is made in the 
calculation of such statistics as the median, quartiles, and percentiles and 
in the drawing of histograms. In Table 2.5 it will be observed that 16 cases 


Table 2.5  Class intervals, exact limits, and mid-points for frequency
distribution of intelligence quotients

1              2                3               4
Class                           Mid-point of
interval       Exact limits     interval        Frequency

130-134        129.5-134.5      132.0             1
125-129        124.5-129.5      127.0             1
120-124        119.5-124.5      122.0             3
115-119        114.5-119.5      117.0             6
110-114        109.5-114.5      112.0             7
105-109        104.5-109.5      107.0            12
100-104         99.5-104.5      102.0            16
 95-99          94.5-99.5        97.0             7
 90-94          89.5-94.5        92.0            17
 85-89          84.5-89.5        87.0             5
 80-84          79.5-84.5        82.0            15
 75-79          74.5-79.5        77.0             6
 70-74          69.5-74.5        72.0             3
 65-69          64.5-69.5        67.0             1
Total                                           100



fall within the interval 100 to 104, which has the exact limits 99.5 to 104.5.
The assumption states that these 16 cases are distributed over the interval
as follows:

Interval         Frequency
103.5-104.5         3.2
102.5-103.5         3.2
101.5-102.5         3.2
100.5-101.5         3.2
 99.5-100.5         3.2
Total              16.0

The second widely used assumption states that all the observations are 
concentrated at the mid-point of the interval, that is, that all the observa- 
tions for that interval are the same and equal to the value corresponding to 
the mid-point of the interval. The mid-point of any class interval is halfway 
between the exact limits of the interval. In the above example the 
mid-point of the interval 99.5 to 104.5 is 102. This second assumption is 
ordinarily made in the calculation of such statistics as means, standard 
deviations, and in the drawing of frequency polygons.


The determination of the mid-point of a class interval should present no 
difficulty. The mid-point may be conveniently obtained by adding one-half 
of the range of the class interval to the lower exact limit of that interval. 
Thus with the interval 100 to 104 the lower limit is 99.5 and one-half the 
class interval is 2.5. The mid-point is therefore 99.5 + 2.5, or 102. Consider 
a 10-point class interval written in the form 100 to 109. Here the lower limit 
is 99.5 and one-half the class interval is 5. The mid-point is then 99.5 + 5, 
or 104.5. Table 2.5, col. 3, shows the mid-points of the corresponding class 


intervals. 
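
The exact limits, the mid-point, and the first assumption of uniform distribution can be expressed together. The following Python sketch is added for illustration and assumes measurements recorded to the nearest unit; the function names are hypothetical.

```python
def interval_limits(lower, upper, unit=1.0):
    """Exact limits of a class interval written as lower to upper."""
    return lower - unit / 2, upper + unit / 2

def mid_point(lower, upper, unit=1.0):
    """Lower exact limit plus one-half the range of the class interval."""
    low, high = interval_limits(lower, upper, unit)
    return low + (high - low) / 2

def spread_uniformly(lower, upper, frequency, unit=1.0):
    """First assumption: cases spread evenly over the exact limits.

    Returns one ((low, high), share) pair per unit step of the interval.
    """
    low, high = interval_limits(lower, upper, unit)
    steps = round((high - low) / unit)
    share = frequency / steps
    return [((low + k * unit, low + (k + 1) * unit), share) for k in range(steps)]

print(interval_limits(95, 99))   # exact limits of the interval 95 to 99
print(mid_point(100, 104))       # 5-point interval
print(mid_point(100, 109))       # 10-point interval
for limits, share in spread_uniformly(100, 104, 16):
    print(limits, share)
```

Under the second assumption all 16 cases of the interval 100 to 104 would instead be placed at the single value returned by `mid_point(100, 104)`.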


2.6 CUMULATIVE FREQUENCY DISTRIBUTIONS 


Situations occasionally arise where our concern is not with the frequencies 
within the class intervals themselves, but rather with the number or per- 
centage of values “greater than” or “less than” a specified value. Such 
information may be made readily available by the preparation of a cumula- 
tive frequency distribution. The cumulative frequencies are obtained by 
adding successively, starting from the bottom, the individual frequencies. 
Table 2.6 shows the cumulative frequencies and cumulative percentages 


for a distribution of intelligence quotients. 


Table 2.6  Cumulative frequencies and cumulative percentage frequencies
for distribution of intelligence quotients

1              2              3               4
                                              Cumulative
Class                         Cumulative      percentage
interval       Frequency      frequency       frequency

130-134            1             106            100.0
125-129            3             105             99.1
120-124            4             102             96.2
115-119           10              98             92.5
110-114            8              88             83.0
105-109           15              80             75.5
100-104           20              65             61.3
 95-99            14              45             42.5
 90-94            11              31             29.2
 85-89             8              20             18.9
 80-84             6              12             11.3
 75-79             5               6              5.7
 70-74             0               1              0.9
 65-69             1               1              0.9
Total            106
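
The successive addition from the bottom can be sketched in Python as follows. This illustration is not part of the original text. To keep the figures easy to check it uses the frequencies of the 100 intelligence quotients grouped in Table 2.4 rather than those of Table 2.6; the function name `cumulate_from_bottom` is hypothetical.

```python
from itertools import accumulate

# Class intervals and frequencies of Table 2.4, highest interval first.
intervals = ["130-134", "125-129", "120-124", "115-119", "110-114",
             "105-109", "100-104", "95-99", "90-94", "85-89",
             "80-84", "75-79", "70-74", "65-69"]
frequencies = [1, 1, 3, 6, 7, 12, 16, 7, 17, 5, 15, 6, 3, 1]

def cumulate_from_bottom(frequencies):
    """Add frequencies successively, starting from the lowest interval."""
    running = list(accumulate(reversed(frequencies)))
    return running[::-1]   # restore highest-interval-first order

cumulative = cumulate_from_bottom(frequencies)
total = cumulative[0]
percentages = [round(100 * c / total, 1) for c in cumulative]

for name, f, c, p in zip(intervals, frequencies, cumulative, percentages):
    print(f"{name:>8} {f:>4} {c:>5} {p:>7.1f}")
```

Each cumulative frequency is the number of cases falling below the upper exact limit of its interval.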




2.7 TABULAR REPRESENTATION


Statistical data are frequently arranged and presented in the form of 
tables. Such tables should be designed to enable the reader to grasp with 
minimal effort the information which they intend to convey. While very 
considerable variety in the design of statistical tables is possible, a number 
of general rules should be observed. Kenney (1954) lists six such rules, and 
these are as follows: 


1 Every table must be self-explanatory. To accomplish this the title should be short, but not
at the expense of clearness.

2 Full explanatory notes, when necessary, should be incorporated in the table, either
directly under the descriptive title and before the body of the table, or else directly under
the table.

3 The columns and rows should be arranged in logical order to facilitate comparisons.

4 In tabulating long columns of figures, space should be left after every five or ten rows.
Long unbroken columns are confusing, especially when one is comparing two numbers in
a row but in widely separated columns.

5 If the numbers tabulated have more than three significant figures, the digits should be
grouped in threes. Thus, one should write 4 685 732, not 4685732.

6 Double lin... If the table nicely fills the width of a page, no side lines should be used. In ...

Tables present... bered and should ... where they are referred to in the text ...
inconvenience.


The appropriate design of statistical tables can become a matter of some 


complexity. This is particularly the case where it is necessary to present 
data which are cross-classified in a variety of ways. 


Reproduced, with permission, from John F. Kenney and E. S. Keeping,
Mathematics of statistics, part 1, 3d ed., copyright 1954, D. Van Nostrand
Company, Inc., Princeton, N.J.


2.8 GRAPHIC REPRESENTATION OF FREQUENCY DISTRIBUTIONS

Graphic representation is often of ... the essential features of fre...
trade publications, business reports, and scientific periodicals use graphic
representation extensively. Graphic representation has been carefully
studied, and much has been written on the subject. While graphic
representation has many ramifications, we shall consider here only those
aspects of the subject which are useful in visualizing the important
properties of frequency distributions and the ways in which one frequency
distribution may differ from another.


2.9 HISTOGRAMS

A histogram is a graph in which the frequencies are represented by areas 
in the form of bars. Table 2.7 presents measures of auditory reaction time 
for a sample of 188 subjects. 

Figure 2.1 shows the frequencies plotted in the form of a histogram. To 
prepare such a histogram proceed as follows. Obtain a piece of suitably 
cross-sectioned graph paper. Paper subdivided into tenths of an inch with 
heavy lines 1 in. apart is convenient. Draw a horizontal line to represent 
reaction time in seconds and a vertical line to represent frequencies. Select 
an appropriate scale, both for reaction time and frequencies. In the present 
case if we allow ту in. for each class interval and 75 in. for each unit of 
frequency, we obtain a graph roughly 6 in. long and 4 in. tall. The scale is
arbitrary. The scale suggested in this case, however, results in a graph of
convenient size. The mid-points of the interval are written along the
horizontal base line, and the frequency scale along the vertical. For each class


Table 2.7  Frequency distribution of auditory reaction times for a
sample of 188 University of Chicago undergraduates*

Class
interval,      Mid-point                       Cumulative
sec            of interval     Frequency       frequency

.34-.35          .345              2              188
.32-.33          .325              2              186
.30-.31          .305              4              184
.28-.29          .285              5              180
.26-.27          .265             11              175
.24-.25          .245             17              164
.22-.23          .225             28              147
.20-.21          .205             69              119
.18-.19          .185             37               50
.16-.17          .165             12               13
.14-.15          .145              1                1
Total                            188

* Adapted from L. L. Thurstone, A factorial study of perception,
University of Chicago Press, Chicago, 1944.


Fig. 2.1  Histogram for data of Table 2.7. Auditory reaction times for 188
students. (Horizontal axis: reaction time, seconds; vertical axis: frequency.)


interval the corresponding frequency is plotted and a horizontal line drawn
the full length of the interval. To complete the graph we may join the ends
of these lines to the corresponding ends of the intervals on the horizontal
...


2.10 FREQUENCY POLYGONS

... mid-point of each interval at a height pro...

Observe that the frequency distribution in Fig. 2.2 is not a smooth
continuous curve, since the lines joining the various points are straight lines. If

Fig. 2.2  Frequency polygon for data of Table 2.7. Auditory reaction
times for 188 students. (Horizontal axis: reaction time, seconds; vertical
axis: frequency.)


we subdivide our intervals into smaller intervals, we shall of course obtain 
irregular frequencies, there being too few members in each interval. Con- 
sider, however, a circumstance where our intervals become smaller and 
smaller and at the same time the total number of cases becomes larger and 
larger. If we carry this process to the extreme situation where we have an 
indefinitely small interval and an indefinitely large number of cases, we 
arrive at the concept of a continuous frequency distribution.
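
The coordinates needed for both graphs can be prepared from Table 2.7 before any drawing is done. The Python sketch below is an illustration added here. Times are held in thousandths of a second so that the arithmetic is exact. Closing the polygon at zero frequency one interval beyond each end is an assumption of this sketch, suggested by the scale of Fig. 2.2; the text-mode bars merely stand in for graph paper.

```python
# Table 2.7 from the lowest interval up; times in thousandths of a second.
MID_POINTS = [145 + 20 * k for k in range(11)]     # .145, .165, ..., .345 sec
FREQUENCIES = [1, 12, 37, 69, 28, 17, 11, 5, 4, 2, 2]
WIDTH = 20                                         # class interval of .02 sec

# Histogram: one bar per interval, spanning its exact limits,
# with height equal to the frequency.
bars = [(m - WIDTH // 2, m + WIDTH // 2, f)
        for m, f in zip(MID_POINTS, FREQUENCIES)]

# Frequency polygon: one point above each mid-point, joined by straight
# lines. The two zero-frequency end points are an assumption (see lead-in).
polygon = ([(MID_POINTS[0] - WIDTH, 0)]
           + list(zip(MID_POINTS, FREQUENCIES))
           + [(MID_POINTS[-1] + WIDTH, 0)])

# A rough text rendering of the histogram, highest interval first.
for low, high, f in reversed(bars):
    print(f".{low:03d}-.{high:03d} {'#' * f} {f}")
```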


2.11 CUMULATIVE FREQUENCY POLYGONS


The drawing of a cumulative frequency polygon differs from that of a
frequency polygon in two respects. First, instead of plotting points
corresponding to frequencies, we plot points corresponding to cumulative
frequencies. Second, instead of plotting points above the mid-point of each
interval, we plot our points above the top of the exact limits of the interval.
This is done because we wish our graph to visually represent the number of
cases falling above or below particular values. In plotting the cumulative
frequency distribution shown in Table 2.7, we would plot the cumulative
frequency 188 against the top of the exact upper limit of the interval, that
is, .355, the frequency 186 against .335, and so on. Figure 2.3 shows the
cumulative frequency distribution for the data appearing in the last column
of Table 2.7.

We may convert our raw frequencies to percentages such that all the
frequencies added together add up to 100 instead of to the number of
cases. We may then determine the cumulative percentage frequencies. We


Fig. 2.3  Cumulative frequency polygon for data of Table 2.7. Auditory
reaction times for 188 students. (Horizontal axis: reaction time, seconds;
vertical axis: frequency.)


may then graph these frequencies and obtain thereby a cumulative
percentage polygon, or ogive. The advantage of this type of diagram is that
from it we can read off directly the percentage of observations less than
any specified value.
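
Reading a percentage from the ogive amounts to interpolating along the straight lines that join the plotted points. The Python sketch below is an illustration added here, using the data of Table 2.7 with times in thousandths of a second. The function name `percent_below` is hypothetical, and the starting point at the lowest exact limit with zero per cent is an assumption of the sketch.

```python
from itertools import accumulate

# Upper exact limits of the intervals of Table 2.7, lowest interval first,
# in thousandths of a second: .155, .175, ..., .355.
UPPER_LIMITS = [155 + 20 * k for k in range(11)]
FREQUENCIES = [1, 12, 37, 69, 28, 17, 11, 5, 4, 2, 2]
TOTAL = sum(FREQUENCIES)

cumulative = list(accumulate(FREQUENCIES))

# Points of the cumulative percentage polygon. The first point, at the
# lowest exact limit (.135) with 0 per cent, is assumed.
points = [(135, 0.0)] + [(u, 100 * c / TOTAL)
                         for u, c in zip(UPPER_LIMITS, cumulative)]

def percent_below(value):
    """Percentage of observations less than `value`, read off the ogive."""
    if value <= points[0][0]:
        return 0.0
    if value >= points[-1][0]:
        return 100.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= value <= x1:
            return y0 + (y1 - y0) * (value - x0) / (x1 - x0)

print(round(percent_below(195), 1))   # per cent of times below .195 sec
print(round(percent_below(205), 1))   # a value inside the interval .20-.21
```

Interpolating within an interval in this way corresponds to the assumption that observations are uniformly distributed over the exact limits of the interval.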


2.12 SOME CONVENTIONS FOR THE CONSTRUCTION OF GRAPHS

1 In the graphing of frequency distributions it is customary to let the
horizontal axis represent scores and the vertical axis frequencies.

2 The arrangement of the graph should proceed from left to right. The
low numbers on the horizontal scale should be on the left, and the low
numbers on the vertical scale should be toward the bottom.

3 The distance along either axis selected to serve as a unit is arbitrary
and affects the appearance of the graph. Some writers suggest that the
units should be selected such that the ratio of height to length is
roughly 3:5. This procedure seems to have some aesthetic advantages.

4 Whenever possible the vertical scale should be so selected that a zero
point falls at the point of intersection of the axes. With some data this
procedure may give rise to a most unusual looking graph. In such cases
it is customary to designate the point of intersection as the zero point
and make a small break in the vertical axis.

5 Both the horizontal and vertical axes should be appropriately labeled.

6 Every graph should be assigned a descriptive title which states
precisely what it is about.


2.13 HOW FREQUENCY DISTRIBUTIONS DIFFER 


Comparison of a number of frequency distributions represented in either
tabular or graphic form indicates that they differ one from another. An
important problem in statistics is the identification and definition of
properties or attributes of frequency distributions which describe how they
differ. It is customary to designate four important properties of frequency
distributions. These are central location, variation, skewness, and kurtosis.
These properties may be viewed either as descriptive of the frequency
distribution itself or as descriptive of the set of observations of which the
distribution is comprised. These alternatives are in effect synonymous. A
frequency distribution is a particular kind of arrangement of a set of
observations. Central location, variation, skewness, and kurtosis may be
discussed either with direct reference to sets of observations or with
reference to the observations arranged in frequency-distribution form.

Central location refers to a value of the variable near the center of the 
frequency distribution. It is a middle point. Measures of central locations 
are called averages. These are discussed in detail in Chap. 3 of this book. 

Variation refers to the extent of the clustering about a central value. If 
all the observations are close to the central value, their variation will be 
less than if they tend to depart more markedly from the central value. 
Measures of variation are discussed in Chap. 4. 

Skewness refers to the symmetry or asymmetry of the frequency dis- 
tribution. If a distribution is asymmetrical and the larger frequencies tend 
to be concentrated toward the low end of the variable and the smaller 
frequencies toward the high end, it is said to be positively skewed. If the 
opposite holds, the larger frequencies being concentrated toward the high
end of the variable and the smaller frequencies toward the low end, the dis- 
tribution is said to be negatively skewed. 

Kurtosis refers to the flatness or peakedness of one distribution in rela- 
tion to another. If one distribution is more peaked than another, it may be 
spoken of as more leptokurtic. If it is less peaked, it is said to be more 
platykurtic. It is conventional to speak of a distribution as leptokurtic if it 
is more peaked than a particular type of distribution known as the normal 
distribution, and platykurtic if it is less peaked. The normal distribution is 
spoken of as mesokurtic, which means that it falls between leptokurtic and 
platykurtic distributions. 

Table 2.8 presents hypothetical data illustrating frequency distributions
with different properties. The distribution in col. 2 is a symmetrical
binomial, a type of distribution which is of much importance in statistical
work and will be considered in detail in a later chapter. The distribution
in col. 3 has central frequencies which are greater than those for the
binomial. It is more peaked than the binomial, and as far as kurtosis is
concerned, it can be said to be leptokurtic. The distribution in col. 4 has
smaller central frequencies than the binomial and larger frequencies toward
the extremities. It can be spoken of as platykurtic. The distribution in
col. 5 has uniform frequencies over all class intervals and is described as
rectangular. The distribution in col. 6 has two humps, or modes. It is said
to be bimodal. In the distribution in col. 7 the largest frequencies occur
at the extremities whereas the central frequencies are the smallest. Such a
distribution is said to be U-shaped. The distributions in cols. 2 to 7 are
all symmetrical and have the same measures of central location although
they differ in variation. Column 8 illustrates a positively skewed and
col. 9 a negatively skewed distribution. Extreme skewness leads to the type
of distribution shown in col. 10, which is described as J-shaped. Figure 2.4
illustrates a rectangular distribution, a bimodal distribution, a U-shaped
distribution, and a J-shaped distribution.

[Table 2.8 Hypothetical data illustrating frequency distributions of
different shapes. Columns: (1) class interval, (2) symmetrical binomial,
(3) leptokurtic, (4) platykurtic, (5) rectangular, (6) bimodal,
(7) U-shaped, (8) positively skewed, (9) negatively skewed, (10) J-shaped.]

Fig. 2.4 Four types of distributions: A, rectangular distribution; B,
bimodal distribution; C, U-shaped distribution; D, J-shaped distribution.


2.14 THE PROPERTIES OF FREQUENCY DISTRIBUTIONS REPRESENTED GRAPHICALLY

The differing characteristics of frequency distributions can be readily
represented in graphical form. Consider the three distributions in Fig. 2.5.
These distributions appear identical in shape. They are markedly different,
however, in terms of the central values about which the observations in
each distribution appear to concentrate; that is, they have different
averages although they may be identical in all other respects. Distribution
A has a lower average than B and B than C.

Fig. 2.5 Three frequency distributions identical in shape but with
different averages.

Now consider the distributions in Fig. 2.6. Inspection of these three
distributions suggests that while the observations in each case appear to
concentrate about the same average, they are nonetheless markedly different
one from another. In the case of distribution A the observations appear to
be more closely concentrated about the average than in the case of B, and
the same applies to B in relation to C. Thus these distributions differ in
variation. The observations in A are less variable than the observations in
B, and those in B are less variable than those in C.

Fig. 2.6 Three frequency distributions with the same average but with
different variation.

Examine now the distributions in Fig. 2.7. These three distributions have
different averages and possibly different measures of variation. They differ
also in skewness. Distribution B is symmetrical about the average; that is,
if we were to fold it over about the average, we should find that it had the
same shape on both sides. A and C are asymmetrical, the shape to the left
of the average being different from the shape to the right. Distribution A
is positively skewed, the longer tail extending toward the high end of the
scale. Distribution C is negatively skewed, the longer tail extending toward
the low end of the scale.

Fig. 2.7 Three frequency distributions differing in skewness.

Consider now the graphical representation of kurtosis as shown in Fig. 
2.8. Distribution A is a symmetrical bell-shaped distribution known as the 
normal distribution. Distribution B is observed to be flatter on top than the 
normal distribution and is referred to as platykurtic, while distribution C is 
more peaked than the normal and is spoken of as leptokurtic. 

Fig. 2.8 Three frequency distributions differing in kurtosis.

In the above discussion the meaning which attaches to the descriptive
properties of collections of measurements arranged in frequency
distributions is largely intuitive and is derived from the inspection of
distributions in tabular or graphic form. To proceed with the study of data
interpretation we require precisely defined numerical measures of central
location, variation, skewness, and kurtosis. Chapters 3 and 4 to follow are
concerned with the more precise and formal delineation of these properties,
their numerical description, and calculation.


EXERCISES

1 The following are marks obtained by a group of 40 university students
on an English examination:

42 88 37 75 98 93 73 62
96 80 52 76 66 54 73 69
83 62 53 79 69 56 81 75
52 65 49 80 67 59 88 80
p ^. 70 72 8 9 ф а w

Prepare a frequency distribution and a cumulative frequency distribution
for these data using a class interval of 5.

2 Write down the exact limits and the mid-points of the class intervals for
the frequency distribution obtained by answering Exercise 1 above.


Write down an acceptable set of 




Prepare cumulative fre 
4 above. 


8 Frequency distributions of intelligence quotients are available for (a) a
random sample of the population at large and (b) a random sample of
university students. In what ways and for what reasons might you
expect these two distributions to differ?


9 In what ways might you possibly expect the frequency distribution of
marks on a history examination to differ from that for a mathematics
examination?


AVERAGES


3.1 INTRODUCTION

are sometimes viewed as appropriate measures for ordinal and nominal
variables, respectively, although they can also be used with interval and
ratio variables.


3.2 THE ARITHMETIC MEAN

By definition the arithmetic mean is the sum of a set of measurements
divided by the number of measurements in the set. Consider the following
measurements: 7, 13, 22, 9, 11, 4. The sum of these six measurements is 66.
The arithmetic mean is therefore 66 divided by 6, or 11.
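
The definition can be sketched in a few lines of Python. The function name `arithmetic_mean` is an illustrative choice, not notation from the text.

```python
def arithmetic_mean(measurements):
    """Return the sum of the measurements divided by how many there are."""
    if not measurements:
        raise ValueError("the mean of an empty set is undefined")
    return sum(measurements) / len(measurements)


scores = [7, 13, 22, 9, 11, 4]   # the six measurements used in the text
print(sum(scores))               # 66
print(arithmetic_mean(scores))   # 11.0
```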

In general, if N measurements are represented by the symbols
$X_1, X_2, X_3, \ldots, X_N$, the arithmetic mean in algebraic language is
as follows:

[3.1]   \[ \bar{X} = \frac{X_1 + X_2 + X_3 + \cdots + X_N}{N} = \frac{\sum_{i=1}^{N} X_i}{N} \]

The symbol $\bar{X}$, spoken of as X bar, is used to denote the arithmetic
mean of the values of X. $\sum_{i=1}^{N}$, the Greek letter sigma, describes
the operation of summing the N measurements. The summation extends from
i = 1 to i = N.


3.3 CALCULATING THE MEAN FROM FREQUENCY DISTRIBUTIONS


Consider a situation where different values of X occur more than once. The
arithmetic mean is then obtained by multiplying each value of X by the
frequency of its occurrence, adding together these products, and then
dividing by the total number of measurements. Consider the following
measurements: 11, 11, 12, 12, 12, 13, 13, 13, 13, 13, 14, 14, 15, 15, 15,
16, 16, 17, 17, 18. The value 11 occurs with a frequency of 2, 12 with a
frequency of 3, 13 with a frequency of 5, and so on. These data may be
written as follows:

X_i      f_i     f_iX_i
18        1        18
17        2        34
16        2        32
15        3        45
14        2        28
13        5        65
12        3        36
11        2        22

Total    20       280
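
The tabulated calculation above can be sketched in Python as follows. The variable names are illustrative, and `collections.Counter` is simply one convenient way of tallying the frequencies.

```python
from collections import Counter

values = [11, 11, 12, 12, 12, 13, 13, 13, 13, 13,
          14, 14, 15, 15, 15, 16, 16, 17, 17, 18]

freq = Counter(values)                         # maps each value X to its frequency f
n = sum(freq.values())                         # total number of measurements
sum_fx = sum(x * f for x, f in freq.items())   # sum of the products fX
mean = sum_fx / n

print(n, sum_fx, mean)                         # 20 280 14.0
```

Summing the products fX gives the same total as summing the 20 raw measurements one by one, which is why the two routes to the mean agree.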


This is a frequency distribution with a class interval of 1. The symbol
$f_i$ is used to denote the frequency of occurrence of the particular value
$X_i$. Multiplying each value $X_i$ by the frequency of its occurrence and
adding together the products $f_iX_i$, we obtain the sum 280. The arithmetic
mean is then 280 divided by 20, or 14.0.

In general, where $X_1, X_2, X_3, \ldots, X_k$ occur with frequencies
$f_1, f_2, f_3, \ldots, f_k$, where k is the number of different values of
X, the arithmetic mean

[3.2]   \[ \bar{X} = \frac{f_1X_1 + f_2X_2 + \cdots + f_kX_k}{N} = \frac{\sum_{i=1}^{k} f_iX_i}{N} \]

Observe that here the summation is over k terms, the number of different

$= \sum_{i=1}^{k} f_iX_i$. The above dis-

int of the interval may be used to represent all
nterval. We assume that the variable X takes
the intervals, and these are


Table 3.1

1            2             3             4
Class        Mid-point     Frequency     Frequency x mid-point
interval     X_i           f_i           f_iX_i

45-49        47             1               47
40-44        42             2               84
35-39        37             3              111
30-34        32             6              192
25-29        27             8              216
20-24        22            17              374
15-19        17            26              442
10-14        12            11              132
5-9           7             2               14
0-4           2             0                0

Total                      76            1,612

$\sum_{i=1}^{k} f_iX_i = 1,612$      $\bar{X} = 1,612/76 = 21.21$


mid-points of all intervals. Second, multiply each mid-point by the
corresponding frequency. Third, sum the products of mid-points by
frequencies. Fourth, divide this sum by N to obtain the mean. To illustrate,
consider Table 3.1.

The mid-points of the intervals $X_i$ appear in col. 2. The frequencies
$f_i$ appear in col. 3. The products of the mid-points by the frequencies
$f_iX_i$ are shown in col. 4. The sum of these products $\sum f_iX_i$ is
1,612, N is 76, and the mean $\bar{X}$ is obtained by dividing 1,612 by 76
and is 21.21.
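
The steps for grouped data can be sketched in Python as below. This is a minimal illustration that assumes each class interval is given by its score limits and that the mid-point is the average of those limits. The names `table_3_1` and `grouped_mean` are hypothetical.

```python
# Each entry is ((lower score limit, upper score limit), frequency), as in Table 3.1.
table_3_1 = [
    ((45, 49), 1), ((40, 44), 2), ((35, 39), 3), ((30, 34), 6), ((25, 29), 8),
    ((20, 24), 17), ((15, 19), 26), ((10, 14), 11), ((5, 9), 2), ((0, 4), 0),
]


def grouped_mean(table):
    """Mean of grouped data, with every case represented by its interval mid-point."""
    n = sum(f for _, f in table)
    # Multiply each mid-point by its frequency, then sum the products.
    sum_fx = sum(f * (low + high) / 2 for (low, high), f in table)
    return sum_fx / n


print(round(grouped_mean(table_3_1), 2))   # 21.21
```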


3.4 THE MEAN OF COMBINED GROUPS 


Consider a group of $n_1$ measurements with mean $\bar{X}_1$ and a set of
$n_2$ measurements with mean $\bar{X}_2$. Denote the ith measurement in the
first group by the symbol $X_{i1}$ and the ith measurement in the second
group by the symbol $X_{i2}$. The first subscript identifies the particular
measurement. The second subscript identifies the group. Thus $X_{72}$ would
in this notation identify the seventh measurement in the second group of
measurements. Let $n_1 + n_2 = N$, the total number of measurements in the
two groups. The mean of all the measurements in the two groups taken
together is

[3.3]   \[ \bar{X} = \frac{\sum_{i=1}^{n_1} X_{i1} + \sum_{i=1}^{n_2} X_{i2}}{n_1 + n_2} = \frac{n_1\bar{X}_1 + n_2\bar{X}_2}{N} \]

To illustrate, the mean of the four measurements 1, 3, 8, and 8 is 5. The
mean of the six measurements 4, 4, 5, 6, 8, and 15 is 7. The mean of all ten
measurements taken together is then

\[ \bar{X} = \frac{4 \times 5 + 6 \times 7}{10} = \frac{62}{10} = 6.2 \]


The above result may be extended to apply to any number of groups. With
more than two groups, say, k, we simply multiply the number of cases in
each group by the group mean, sum the k products thus obtained, and
divide by N, the total number of measurements in the k groups. Thus with k
groups

[3.4]   \[ \bar{X} = \frac{\sum_{j=1}^{k} n_j\bar{X}_j}{N} \]
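
A short Python sketch of the combined-groups calculation follows. The function name `combined_mean` and the representation of each group as a (size, mean) pair are illustrative assumptions.

```python
def combined_mean(groups):
    """Mean of all measurements, given a (group size, group mean) pair per group."""
    total_n = sum(n for n, _ in groups)
    return sum(n * group_mean for n, group_mean in groups) / total_n


print(combined_mean([(4, 5), (6, 7)]))   # 6.2

# Cross-check against the ten raw measurements pooled together.
pooled = [1, 3, 8, 8] + [4, 4, 5, 6, 8, 15]
print(sum(pooled) / len(pooled))         # 6.2
```

Weighting each group mean by its group size matters: the unweighted average of 5 and 7 is 6, not 6.2.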


3.5 SOME PROPERTIES OF THE ARITHMETIC MEAN 


The sum of the deviations of all the measurements in a set from their
arithmetic mean is zero. The arithmetic mean of the measurements 7, 13, 22,
9, 11, and 4 is 11. The deviations of these measurements from this mean are
-4, 2, 11, -2, 0, and -7. The sum of these deviations is zero.

Proof of this result is as follows:

[3.5]   \[ \sum_{i=1}^{N} (X_i - \bar{X}) = \sum_{i=1}^{N} X_i - N\bar{X} = N\bar{X} - N\bar{X} = 0 \]

Observe that since $\bar{X} = \left(\sum_{i=1}^{N} X_i\right)/N$ it follows
that $\sum_{i=1}^{N} X_i = N\bar{X}$. Also adding $\bar{X}$, the mean, N
times is the same as multiplying $\bar{X}$ by N; thus if $\bar{X}$ is 11 and
N is 6, we observe that 11 + 11 + 11 + 11 + 11 + 11 = 6 x 11 = 66.

The sum of squares of deviations about the arithmetic mean is less than
the sum of the squares of deviations about any other value. The deviations
of the measurements 7, 13, 22, 9, 11, 4 from the mean 11 are -4, 2, 11, -2,
0, -7. The squares of these deviations are 16, 4, 121, 4, 0, 49. The sum of
squares is 194. Had any other origin been selected, the sum of squares of
deviations would be greater than the sum of squares about the mean.
Select a different origin, say 13. The deviations are -6, 0, 9, -4, -2, -9.
Squaring these we have 36, 0, 81, 16, 4, 81. The sum of these squares is
218, which is greater than the sum of squares about the mean. The selection
of any other origin will demonstrate the same result.


at it is the centroid, or center of 
ed, the mean is the central value 


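\[ X_i - (\bar{X} + c) = (X_i - \bar{X}) - c \]

Squaring and summing over N observations we obtain

[3.6]   \[ \sum_{i=1}^{N} [X_i - (\bar{X} + c)]^2 = \sum_{i=1}^{N} (X_i - \bar{X})^2 + \sum_{i=1}^{N} c^2 - 2c\sum_{i=1}^{N} (X_i - \bar{X}) \]

Because the sum of deviations about the mean is zero, the third term to the
right is zero. Also $c^2$ summed N times is $Nc^2$, and we write

[3.7]   \[ \sum_{i=1}^{N} [X_i - (\bar{X} + c)]^2 = \sum_{i=1}^{N} (X_i - \bar{X})^2 + Nc^2 \]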

ple of size N is an estimate of a population mean, which is the value
obtained where it is possible to measure all members of the population. The
mean has the property that for most distributions it is a better, or more
accurate, or more efficient, estimate of the population mean than other
measures of central location such as the median and the mode. This is one
reason why it is most frequently used. Proof of this result is beyond the
scope of this book.

Reference has been made to a number of properties of the arithmetic
mean. What importance attaches to these properties, or why should they
be discussed? The fact that the sum of deviations about the mean is zero
greatly simplifies many forms of algebraic manipulation. Any term
involving the sum of deviations about the mean will vanish. The fact that
the sum of squares of deviations about the mean is a minimum in effect
implies an alternative definition of the mean; namely, the mean is that
measure of central location about which the sum of the squares is a
minimum. In effect, the mean is a measure of central location in the
least-squares sense. The method of least squares is of considerable
importance in statistics and is used, for example, in the fitting of lines
and curves. The mean may be regarded as a point located by the method of
least squares. The properties pertaining to change of origin and change of
unit are of importance in that they lead to simplified methods of computing
the mean where a fairly large number of observations is involved. The fact
that the sample mean provides a better estimate of a population parameter
than other measures of central location is of primary importance.
Throughout statistics we are concerned with the problem of making
statements about population values from our knowledge of sample values.
Obviously, the more accurate these statements are, the better.


3.6 THE MEDIAN
Another commonly used measure of central location is the median. The
median is a point on a scale such that half the observations fall above it and 
half below it. The observations 2, 7, 16, 19, 20, 25, and 27 are arranged in 
order of magnitude. Here N is an odd number and the median is 19; three 
observations fall above it and three below it. If another observation, say, 
31, is included, the median is then taken as the arithmetic mean of the two 
middle values 19 and 20; that is, the median is (19 + 20)/2, or 19.5. Con- 
sider a situation where certain values of the variable occur more than once, 
as, for instance, with the observations 7, 7, 7, 8, 8, 8, 9, 9, 10, 10. The three 
8's are assumed to occupy the interval 7.5 to 8.5. The median is obtained by 
interpolation. In this instance we must interpolate two-thirds of the way 
into the interval to obtain a point above and below which half the observa- 
tions fall. The median is then taken as 7.5 + 0.66 = 8.16. 
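
The three cases above can be reproduced with Python's standard `statistics` module. This is a sketch: `median_grouped` is used here on the assumption that each repeated value occupies an interval of width 1 centered on it, which is the convention the text describes for the three 8's.

```python
import statistics

odd_median = statistics.median([2, 7, 16, 19, 20, 25, 27])
even_median = statistics.median([2, 7, 16, 19, 20, 25, 27, 31])
print(odd_median)    # 19
print(even_median)   # 19.5

# Repeated values: interpolate within the interval 7.5 to 8.5.
tied = [7, 7, 7, 8, 8, 8, 9, 9, 10, 10]
tied_median = statistics.median_grouped(tied, interval=1)
print(round(tied_median, 2))   # 8.17, i.e. 7.5 + 2/3 (the text truncates this to 8.16)
```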

With a frequency distribution represented in graphical form, the ordinate
at the median divides the total area under the curve into two equal parts.


3.7 CALCULATING THE MEDIAN FROM FREQUENCY DISTRIBUTIONS


In calculating the median from data grouped in the form of a frequency
distribution the problem is to determine a value of the variable such that
one-half the observations fall above this value and the other half below.
The method will be illustrated with reference to the data in Table 3.2.

First, record the cumulative frequencies as shown in col. 3. Second,
determine N/2, one-half the number of cases, in this example 38. Third,
find the class interval in which the 38th case, the middle case, falls. The
38th case falls within the interval 15 to 19, and the exact limits of this
interval are 14.5 and 19.5. Clearly, the 38th case falls very close to the top
of this interval because we know from an examination of our cumulative
frequencies that 39 cases fall below the top of this interval, that is, below
19.5. Fourth, interpolate between the exact limits of the interval to find a
value above and below which 38 cases fall. To interpolate, observe that 26
cases fall within the limits 14.5 and 19.5, and we assume that these 26
cases are distributed in rectangular fashion between these exact limits.
Since 13 cases fall below 14.5, we require 38 - 13 = 25 of these 26 cases.
The proportion of the interval we require is 25/26, which in score units is
(25/26) x 5, or 4.81. We add this to the lower limit of the interval to
obtain the median, which is 14.50 + 4.81, or 19.31.

Let us summarize the steps involved:

1 Compute the cumulative frequencies.

2 Determine N/2, one-half the number of cases.

3 Find the class interval in which the middle case falls, and determine
the exact limits of this interval.

4 Interpolate between the exact limits of the interval to find a value on
the scale above and below which one-half the total number of cases falls.
This is the median.


For the student who has difficulty in following the above a simple
formula may be employed.

[3.8]   \[ \text{Median} = L + \frac{N/2 - F}{f_m}\,h \]

where L = exact lower limit of interval containing the median
      F = sum of all frequencies below L
      f_m = frequency of interval containing median
      N = number of cases
      h = class interval

In the present example L = 14.5, F = 13, f_m = 26, N = 76, and h = 5. We
then have

\[ \text{Median} = 14.5 + \frac{38 - 13}{26} \times 5 = 19.31 \]


Table 3.2 Frequency distribution of psychological test scores

1            2             3
Class                      Cumulative
interval     Frequency     frequency

45-49         1            76
40-44         2            75
35-39         3            73
30-34         6            70
25-29         8            64
20-24        17            56
15-19        26            39
10-14        11            13
5-9           2             2
0-4           0             0

Total        76
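
The interpolation formula can be sketched in Python as follows. The function name `grouped_median` and its argument layout (exact lower limits in ascending order, matching frequencies, and a common interval width h) are assumptions made for illustration.

```python
def grouped_median(lower_limits, freqs, h):
    """Median = L + ((N/2 - F) / f_m) * h for a grouped frequency distribution.

    lower_limits: exact lower limits of the class intervals, in ascending order
    freqs:        frequency of each interval, in the same order
    h:            width of the class interval
    """
    half = sum(freqs) / 2
    below = 0                        # F, the cumulative frequency below L
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and below + f >= half:
            return lower + (half - below) / f * h
        below += f
    raise ValueError("the distribution contains no cases")


# Table 3.2 in ascending order: intervals 0-4, 5-9, ..., 45-49.
limits = [-0.5, 4.5, 9.5, 14.5, 19.5, 24.5, 29.5, 34.5, 39.5, 44.5]
freqs = [0, 2, 11, 26, 17, 8, 6, 3, 2, 1]
print(round(grouped_median(limits, freqs, 5), 2))   # 19.31
```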


3.8 THE MODE


{егеп 


occurring value. Cons 
13, 14, 14, 14, 15, 15, 15, 16, 16, 17, 17, 18. Here the value 13 occurs 5
times, more frequently than any other value; hence the mode is 13.

In situations where all values of X occur with equal frequency, where
that frequency may be equal to or greater than 1, no modal value can be
calculated. Thus for the set of observations 2, 7, 16, 19, 20, 25, and 27 no
mode can be obtained. Similarly, the observations 2, 2, 2, 7, 7, 7, 16, 16, 16,
19, 19, 19, 20, 20, 20, 25, 25, 25, 27, 27, 27 do not permit the calculation of a
modal value. All values occur with a frequency of 3.

In the case where two adjacent values of X occur with the same
frequency, which is larger than the frequency of occurrence of other values
of X, the mode may be taken rather arbitrarily as the mean of the two
adjacent values of X. Consider the observations 11, 11, 12, 12, 12, 13, 13,
13, 13, 14, 14, 14, 14, 15, 15, 16, 16, 17, 18. Here the values 13 and 14 both
occur with a frequency of 4, which is greater than the frequency of
occurrence of the remaining values. The mode may be taken as (13 + 14)/2, or
13.5.


Where two nonadjacent values of X occur such that the frequencies of
both are greater than the frequencies in adjacent intervals, then each value
of X may be taken as a mode and the set of observations may be spoken of
as bimodal. Consider the observations 11, 11, 12, 12, 12, 13, 13, 13, 13, 13,
14, 14, 14, 15, 15, 15, 15, 16, 16, 16, 17, 17, 18. Here the value 13 occurs
five times, and this is greater than the frequency of occurrence of the
adjacent values. Also 15 occurs four times, and this is also greater than the
frequency of occurrence of the adjacent values. This set of observations
may be said to be bimodal.

With data grouped in the form of a frequency distribution the mode is 
taken as the mid-point of the class interval with the largest frequency. 

The mode is a statistic of limited practical value. It does not lend itself
readily to algebraic manipulation. It has little meaning unless the number
of measurements under consideration is fairly large.
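
Some of the conventions described in this section can be sketched in Python. The function `modes` below is a hypothetical helper, not a standard routine. It covers only three cases: no mode when all values are equally frequent, a single most frequent value, and the mean of two tied values that are adjacent among the observed values. It returns tied nonadjacent values as separate modes, and it does not attempt the more general bimodal case in which the two peaks have unequal frequencies.

```python
from collections import Counter


def modes(observations):
    """Return a list of modal values; an empty list means no mode can be calculated."""
    counts = Counter(observations)
    if len(set(counts.values())) == 1:
        return []                                   # every value is equally frequent
    distinct = sorted(counts)
    top = max(counts.values())
    winners = [x for x in distinct if counts[x] == top]
    if len(winners) == 2:
        i = distinct.index(winners[0])
        if distinct[i + 1] == winners[1]:
            # Two adjacent values share the largest frequency: take their mean.
            return [(winners[0] + winners[1]) / 2]
    return winners


print(modes([2, 7, 16, 19, 20, 25, 27]))            # []
print(modes([11, 11, 12, 12, 12, 13, 13, 13, 13, 14, 14, 14, 14,
             15, 15, 16, 16, 17, 18]))              # [13.5]
```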


3.9 COMPARISON OF THE MEAN, MEDIAN, AND MODE


The arithmetic mean may be ге 
central location for interval and ra 
the variable are incorporated in i 


same median, namely 20, although 


the greatest frequency, is а nominal statistic. 


f the variable or their 


ade from heavy card- 


€st point of the curve. 
m i ewed, these three mea- 
sures do not coincide. Figure 3.1 shows the mean à 


than the median, which 
is negatively skewed the reverse relation holds, 


Fig. 3.1 Relation between the mean, median, and mode in a positively
skewed frequency distribution.

A question may be raised regarding the appropriate choice of a measure
of central location. In practical situations this question is rarely in doubt.
The arithmetic mean is usually to be preferred to either the median or the
mode. It is rigorously defined, easily calculated, and readily amenable to
algebraic treatment. It provides also a better estimate of the corresponding
population parameter than either the median or the mode.

The median is, however, to be preferred in some situations. Observations
may occur which appear to be atypical of the remaining observations
in the set. Such observations may greatly affect the value of the mean.
Consider the observations 2, 3, 3, 4, 7, 9, 10, 11, 86. Observation 86 is quite
atypical of the remaining observations, and its presence greatly affects the
value of the mean. The mean is 15, a value greater than eight of the nine
observations. The median is 7. Under circumstances such as this it may
prove advisable in treating the data to use statistical procedures that are
based on the ordinal properties of the data in preference to procedures that
incorporate all the particular values of the variable and may be grossly
affected by atypical values. The median, an ordinal statistic, may under
such circumstances be preferred to the mean. In the above example the set
of observations is grossly asymmetrical. If the distribution of the variables
shows gross asymmetry, the median may be the preferred statistic,
because, regardless of the asymmetry of the distribution, it can always be
interpreted as the middle value.

For a strictly nominal variable the mode, the most frequently occurring
class or value, is the only "most typical" statistic that can be used. It is
rarely used with interval, ratio, and ordinal variables where means and
medians can be calculated.


EXERCISES

1 In 100 rolls of a die the frequencies of the six possible events are as
follows:


Compute the arithmetic mean for this distribution.

2 The following is a frequency distribution of examination marks:

Class
interval     f
90-94        1
85-89        4
80-84        2
75-79        8
70-74        9
65-69       14
60-64        6
55-59        6
50-54        4
45-49        3
40-44        3
N = 60

Compute the arithmetic mean.

3 How does the addition of a constant and multiplication by a constant
affect the arithmetic mean?

Squares of deviations from an arbitrary origin
of 60 (Eq. [3.7])?

6 Compute medians for the following data:
a 3, 7, 15, 26, 51
b 3, 9, 22, 25, 31, 46
c 6, 25, 30, 35, 45, 64
d 12, 19, 24, 24, 29, 42
e 4, 4, 5, 5, 6

7 Compute modes for the following data:
a 2, 2, 5, 5, 5, 6, 6, 6, 7, 8, 12
b 3, 3, 4, 4, 4, 5, 7, 7, 9, 12

8 Compute medians and modes for the data in Exercises 1 and 2 above.


MEASURES OF VARIATION,
SKEWNESS, AND KURTOSIS


4.1 INTRODUCTION


Of great concern to the Statistician is the variation in the events of nature. 
The variation of one measurement from an 


с measurements such as height, 
forearm, and angular separation 
en individuals. Anatomical and 


measurements be described? Con- 
wo samples: 


Sample A     10     12     15     18     20
Sample B      2      8     15     22     28

We note that the two samples have the same mean, namely, 15. Simple
inspection indicates, however, that the measurements in sample B are more
variable than those in sample A; they differ more one from another. Among
the possible measures used to describe this variation are the range, the
mean deviation, and the standard deviation. The most important of these is
the standard deviation.


4.2 THE RANGE 

The range is the simplest measure of variation. In any sample of
measurements the range is taken as the difference between the largest and
smallest measurements. The range for the measurements 10, 12, 15, 18, 20 is
20 minus 10, or 10. The range for the measurements 2, 8, 15, 22, 28 is 28
minus 2, or 26. The measurements in the second set clearly exhibit
greater variation than those in the first set, and this reflects itself in the
greater range. The range has two disadvantages. First, for large samples it
is an unstable descriptive measure. The sampling variance of the range for
small samples is not much greater than that of the standard deviation but
increases rapidly with increase in N. Second, the range is not independent
of sample size, except under special circumstances. For distributions that
taper to zero at the extremities a better chance exists of obtaining extreme
values for large than for small samples. Consequently, ranges based
on samples composed of different numbers of cases are not directly
comparable. Despite these disadvantages the range may be effectively used
in the application of tests of significance with small samples.
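
As a minimal sketch, the range of the two samples can be computed as below. The function name `value_range` is an illustrative choice.

```python
def value_range(measurements):
    """Difference between the largest and the smallest measurement."""
    return max(measurements) - min(measurements)


sample_a = [10, 12, 15, 18, 20]
sample_b = [2, 8, 15, 22, 28]
print(value_range(sample_a))   # 10
print(value_range(sample_b))   # 26
```

Only the two extreme measurements enter the calculation, which is one way of seeing why the range depends so strongly on how many cases the sample contains.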


4.3 THE MEAN DEVIATION 
Consider the following measurements:

Sample A      8      8      8      8      8
Sample B      1      4      7     10     13
Sample C      1      5     20     25     29

Intuitively, the measurements in sample A are less variable than those in
B, which in turn are less variable than those in C. Indeed, the
measurements in A exhibit no variation at all. The means of the three
samples are 8, 7, and 16. If we express the measurements as deviations from
their sample means, we obtain

Sample A      0      0      0      0      0
Sample B     -6     -3      0     +3     +6
Sample C    -15    -11     +4     +9    +13




4.4 




Inspection of these numbers suggests that as variation increases, the
departure of the observations from their sample mean increases. We may
use this characteristic to define a measure of variation. One such measure
is the mean deviation. The mean deviation is the arithmetic mean of the
absolute deviations from the arithmetic mean. An absolute deviation is a
deviation without regard to algebraic sign. To obtain the mean deviation we
simply calculate the deviations from the arithmetic mean, sum these,
disregarding algebraic sign, and divide by N. For sample A above, the mean
deviation is zero. For sample B the mean deviation is (6 + 3 + 0 + 3 +
6)/5 = 18/5 = 3.6. For sample C the mean deviation is (15 + 11 + 4 + 9 +
13)/5 = 52/5 = 10.4.

The mean deviation is given in algebraic language by the formula

[4.1]   \[ \text{M.D.} = \frac{\sum |X - \bar{X}|}{N} \]
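
The mean deviation of the three samples can be computed with a short Python sketch. The function name `mean_deviation` is illustrative.

```python
def mean_deviation(measurements):
    """Arithmetic mean of the absolute deviations from the arithmetic mean."""
    n = len(measurements)
    mean = sum(measurements) / n
    return sum(abs(x - mean) for x in measurements) / n


print(mean_deviation([8, 8, 8, 8, 8]))       # 0.0
print(mean_deviation([1, 4, 7, 10, 13]))     # 3.6
print(mean_deviation([1, 5, 20, 25, 29]))    # 10.4
```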


Here $X - \bar{X}$ is a deviation from the mean and $|X - \bar{X}|$ is a
deviation without regard to algebraic sign. The bars mean that signs are
ignored.

Hitherto, symbols above and below the summation sign $\sum$ have been
used to indicate the limits of the summation. In the above formula for the
mean deviation these symbols have been omitted, the summation being
clearly understood to extend over the N members in the sample. In this and
subsequent chapters symbols indicating the limits of summation will, for
convenience, be omitted where these are understood clearly from the
context to extend over N sample members. Where any possibility of doubt
could exist, the symbols above and below the summation sign will be
inserted.


The mean deviation is infrequently used. It is not readily amenable to
algebraic manipulation. This circumstance stems from the use of absolute
values. In general, in statistical work the use of absolute values should be
avoided, if at all possible. It is of interest to note that the sum of absolute
deviations about the median is a minimum. Consider the numbers 1, 5, 20,
25, 29. The median is 20. The sum of absolute deviations is 19 + 15 + 0 +
5 + 9 = 48.

recommend it. An alternate, and in general preferable, procedure is to
square the deviations about the mean, sum these squares, and use this sum
of squares in the definition of a measure of variation. For example, the
mean of the measurements 1, 4, 7, 10, and 13 is 7. The deviations about the
mean are -6, -3, 0, +3, and +6. The squares of these deviations are 36, 9,
0, 9, and 36. The sum of squares is 90.

The sum of squares of deviations about the mean is used in the definition
of a statistic known as the variance. Two methods exist for defining the
variance. Both are in common use. One method defines the variance by
dividing the sum of squares of deviations about the mean by N, the number
of cases. Denote this statistic by $s^2$. Thus

[4.2]   \[ s^2 = \frac{\sum (X - \bar{X})^2}{N} \]

In the illustrative example in the paragraph above, the sum of squares of
deviations about the mean was 90, and the variance, according to formula
[4.2], is $s^2$ = 90/5 = 18.

An alternate method of defining the variance is to divide the sum of
squares by N - 1 rather than N. Thus, according to this definition, the
variance $s^2$ is given by

[4.3]   \[ s^2 = \frac{\sum (X - \bar{X})^2}{N - 1} \]

Both formulas, [4.2] and [4.3], provide alternate definitions of the
variance. These formulas have no derivation, but are obtained by a process
of plausible reasoning. The reader will note that no notational distinction
is made here between the variance defined by formula [4.2] and that defined
by formula [4.3].
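
The two definitions can be compared on the illustrative measurements with the sketch below. The variable names are illustrative. The cross-check uses `pvariance` and `variance` from Python's standard `statistics` module, which divide by N and by N - 1 respectively.

```python
import statistics

x = [1, 4, 7, 10, 13]
n = len(x)
mean = sum(x) / n
sum_sq = sum((xi - mean) ** 2 for xi in x)   # sum of squares about the mean

var_n = sum_sq / n                           # formula [4.2], divisor N
var_n_minus_1 = sum_sq / (n - 1)             # formula [4.3], divisor N - 1
print(sum_sq, var_n, var_n_minus_1)          # 90.0 18.0 22.5

# The standard library offers both conventions.
print(statistics.pvariance(x) == var_n)            # True
print(statistics.variance(x) == var_n_minus_1)     # True
```

With only five measurements the two definitions differ noticeably (18 against 22.5); the gap shrinks as N grows, since the two divisors differ by 1.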
What is the essential difference between the variance as defined by
formula [4.2] and the variance as defined by formula [4.3]? To understand
the answer to this question the reader should recall that in Chapter 1
a distinction was made between sample values, or estimates, and population
values or parameters. Both formulas, [4.2] and [4.3], provide estimates of
a population variance $\sigma^2$. For certain algebraic reasons when we
divide $\sum (X - \bar{X})^2$ by N we obtain a biased estimate of
$\sigma^2$. This estimate will show a systematic tendency to be less than
$\sigma^2$. It is biased. When we divide $\sum (X - \bar{X})^2$ by N - 1,
however, we obtain an unbiased estimate of $\sigma^2$. Such an estimate
will show no systematic tendency to be either greater than or less than
$\sigma^2$.

In all situations where an estimate of a population variance $\sigma^2$ is
required, the statistic $s^2$ which divides the sum of squares by N - 1 and
not N should be used. In the great majority of situations discussed in this
book an unbiased estimate is required. In some situations, involving
descriptive statistics, convenience and simplicity dictate the use of an
estimate which divides the sum of squares by N and not N - 1. In general in
this book the symbol $s^2$ will be used to refer to the unbiased estimate
obtained by dividing the sum of squares by N - 1, as in formula [4.3]. In a
few situations the definition of formula [4.2] will be used. All such
instances will be clearly specified in the text.

The reader should note that a population variance is defined as
$\sigma^2 = \sum (X - \mu)^2 / N_p$, where $\mu$ is the population mean and
$N_p$ is the number of members in the population. Thus in any situation
where the variance of a complete population is required we divide the sum of
squares by the number of members in the population.

While $N$ is the number of measurements or observations, the quantity
$N - 1$ is the number of deviations about the mean that are free to vary. To
illustrate, consider the measurements 7, 8, and 15. The mean is 10, and the
deviations about the mean are -3, -2, +5. The sum of deviations about
the mean is zero; that is, (-3) + (-2) + (5) = 0. Because this is so, if any
two of the deviations are known, the third deviation is fixed. It cannot vary.
In this example, the sum of squares of deviations about the mean is
9 + 4 + 25 = 38. Although this sum of squares is obtained by adding together
three squared deviations, only two of these squared deviations can exhibit
freedom of variation. The number of values that are free to vary is called
the number of degrees of freedom. A quantity of the kind
$\sum (X - \bar{X})^2$ is said to have associated with it $N - 1$ degrees of
freedom, because $N - 1$ of the
N squared deviations of whi 


useful and general concep 
elsewhere in this book. 


ment and every ot 
—6, —3. Note that the sign of the difference 


= ral, in algebraic notation it 
may be shown that

[4.4]    $$s^2 = \frac{1}{2} \cdot \frac{\sum (X_i - X_j)^2}{N(N - 1)/2}$$

where the summation is understood to extend over $N(N - 1)/2$ differences.
This result means that $s^2$ is a descriptive index of how different
each value is from every other value; in fact it is an average of the squared
differences divided by 2.
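The equivalence between the variance and half the average squared difference can be checked directly. The sketch below is illustrative and not from the original text; the function names are assumptions, and the data are the measurements 7, 8, and 15 used above in the discussion of degrees of freedom.

```python
from itertools import combinations


def unbiased_variance(values):
    """Sum of squared deviations about the mean divided by N - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def half_mean_squared_difference(values):
    """Average of the squared differences over all N(N - 1)/2 pairs, divided by 2."""
    pairs = list(combinations(values, 2))
    mean_squared_difference = sum((a - b) ** 2 for a, b in pairs) / len(pairs)
    return mean_squared_difference / 2


measurements = [7, 8, 15]
from_deviations = unbiased_variance(measurements)           # 38 / 2
from_pairs = half_mean_squared_difference(measurements)     # (1 + 64 + 49) / 3 / 2
print(from_deviations, from_pairs)
```

Both routes give 19 for these three measurements, so no reference to the mean is needed to compute the variance.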

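The variance is a statistic in squared units. If $X - \bar{X}$ is a
deviation in feet, then $(X - \bar{X})^2$ is a deviation in feet squared. For
many purposes it is desirable to use a measure of variation which is not in
squared units, but in units of the original measurements themselves. We
obtain this result by taking the square root of either formula [4.2] or
formula [4.3]. This statistic is called the standard deviation. Thus

[4.5]    $$s = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$$

or

[4.6]    $$s = \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}}$$

Here again, as with the variance, the standard deviation obtained using $N$
rests on the biased estimate of $\sigma^2$, and that obtained using $N - 1$
rests on the unbiased estimate. Strictly, the square root of an unbiased
variance estimate is not itself an exactly unbiased estimate of $\sigma$,
although the remaining bias is small for samples of moderate size.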


AN ILLUSTRATIVE APPLICATION 

Our understanding of the nature of the variance and the standard deviation
will be enriched by considering illustrative situations where these statistics
are of interest. Consider a simple experiment designed to investigate the
effect of a drug on a cognitive task such as coding. An experimental group
of subjects, who receive the drug, and a control group, who do not receive
the drug, are used. Each group contains 10 subjects. Let us assume the
scores on the coding task for the two groups are as follows:

Experimental     5   7  17  31  45  47  68  85  96  99
Control         29  36  37  42  49  58  62  63  69  70

The mean score for the experimental group is 50.0, and that for the control
group is 51.5. The investigator might be led to conclude from inspecting these
means that the drug had little or no effect on the performance of the
subjects. The standard deviations for the two groups are, respectively, 35.63
and 14.86, the experimental group being much more variable in performance
than the control group. Quite clearly the treatment is exerting a
substantial influence on the variation in performance, although its influence
on level of performance is negligible. In the analysis of experimental data
the investigator must attend to, and if possible interpret, differences in the
standard deviation, or variance, as well as differences in the arithmetic
mean.
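The figures quoted for the two groups can be reproduced with Python's standard `statistics` module, whose `stdev` function divides by $N - 1$ as in formula [4.6]. This is a checking sketch added for illustration; the variable names are assumptions.

```python
from statistics import mean, stdev

experimental = [5, 7, 17, 31, 45, 47, 68, 85, 96, 99]
control = [29, 36, 37, 42, 49, 58, 62, 63, 69, 70]

# Means are nearly equal, but the spread differs sharply between groups.
mean_experimental = mean(experimental)
mean_control = mean(control)
sd_experimental = stdev(experimental)   # N - 1 in the denominator
sd_control = stdev(control)

print(mean_experimental, round(sd_experimental, 2))
print(mean_control, round(sd_control, 2))
```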


4.6 CALCULATING THE SAMPLE VARIANCE AND THE
STANDARD DEVIATION FROM UNGROUPED DATA


For purposes of calculation, it is convenient to write the variance and the
standard deviation in a different form. The variance may be written as

$$
\begin{aligned}
s^2 &= \frac{\sum (X - \bar{X})^2}{N - 1} \\
    &= \frac{\sum (X^2 + \bar{X}^2 - 2X\bar{X})}{N - 1} \\
    &= \frac{\sum X^2 + N\bar{X}^2 - 2N\bar{X}^2}{N - 1} \\
    &= \frac{\sum X^2 - N\bar{X}^2}{N - 1}
\end{aligned}
$$

In this derivation note that the summation of $\bar{X}^2$ over $N$ is simply
$N\bar{X}^2$; also the summation of $2X\bar{X}$ is
$2\bar{X}\sum X = 2N\bar{X}^2$. The standard deviation is given by

$$s = \sqrt{\frac{\sum X^2 - N\bar{X}^2}{N - 1}}$$

Thus to calculate the standard deviation using this formula, we sum the
squares of the original observations, subtract from this $N$ times the square
of the arithmetic mean, divide by $N - 1$, and then take the square root. For
example, the five observations 1, 4, 7, 10, and 13 have a mean of 7. The
squares of these observations are 1, 16, 49, 100, and 169. The sum of these
squared observations is 335. The variance is then

$$s^2 = \frac{\sum X^2 - N\bar{X}^2}{N - 1} = \frac{335 - 5 \times 7^2}{5 - 1} = 22.50$$

and the standard deviation is $\sqrt{22.50} = 4.74$.
An alternative formula for the standard deviation which avoids the
calculation of the arithmetic mean and may, therefore, be useful for certain
computational purposes is

$$s = \sqrt{\frac{N\sum X^2 - (\sum X)^2}{N(N - 1)}}$$

This formula requires one operation of division only.
In computing a standard deviation
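The three ways of writing the standard deviation can be compared on the same data. The sketch below is illustrative and not part of the original text; the function names are assumptions. It applies the deviation form, the raw-score form using $\sum X^2 - N\bar{X}^2$, and the single-division form to the observations 1, 4, 7, 10, 13.

```python
import math


def sd_from_deviations(values):
    """Definition: square root of sum of squared deviations over N - 1."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))


def sd_from_raw_scores(values):
    """Computational form: (sum X^2 - N * mean^2) / (N - 1)."""
    n = len(values)
    mean = sum(values) / n
    sum_sq = sum(x * x for x in values)
    return math.sqrt((sum_sq - n * mean ** 2) / (n - 1))


def sd_single_division(values):
    """Form that avoids the mean: (N * sum X^2 - (sum X)^2) / (N (N - 1))."""
    n = len(values)
    total = sum(values)
    sum_sq = sum(x * x for x in values)
    return math.sqrt((n * sum_sq - total ** 2) / (n * (n - 1)))


observations = [1, 4, 7, 10, 13]
results = [f(observations) for f in
           (sd_from_deviations, sd_from_raw_scores, sd_single_division)]
print([round(r, 2) for r in results])
```

With floating-point arithmetic the raw-score forms can lose precision when the mean is large relative to the spread, so the deviation form is usually the safer choice in software.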


4.7 THE EFFECT ON THE STANDARD DEVIATION
OF ADDING OR MULTIPLYING BY A CONSTANT

If a constant is added to all the observations in a sample, the standard
deviation remains unchanged. An examiner may conclude, for example,
that an examination is too difficult. He may decide to add 10 points to all
the marks assigned. The standard deviation of the original marks will be
the same as the standard deviation of marks with the 10 points added. This
result follows directly from the fact that if $X$ is an observation, the
corresponding observation with the constant $c$ added is $X + c$. If
$\bar{X}$ is the mean of the original observations, the mean with the
constant added is $\bar{X} + c$. A deviation from the mean of the
observations with the constant added is then $(X + c) - (\bar{X} + c)$, which
is readily observed to be equal to $X - \bar{X}$. Since the deviations about
the mean are unchanged by the addition of a constant, the standard deviation
will remain unchanged. To illustrate, by adding a constant, say 5, to the
measurements 1, 4, 7, 10, and 13, we obtain 6, 9, 12, 15, and 18. The mean of
the original measurements is 7, and the mean of the measurements with the
constant added is 7 + 5, or 12. The deviations from the mean are in both
instances the same, namely, -6, -3, 0, +3, and +6. The standard deviation in
both instances is 4.74.

If all measurements in a sample are multiplied by a constant, the standard
deviation is also multiplied by the absolute value of that constant. If
the standard deviation of examination marks is 4 and all marks are
multiplied by the constant 3, then the standard deviation of the resulting
marks is 3 × 4 = 12. To demonstrate this result, we observe that if
$\bar{X}$ is the mean of a sample of measurements, the mean of the
measurements multiplied by $c$ is $c\bar{X}$. A deviation from the mean is
then $cX - c\bar{X} = c(X - \bar{X})$. By squaring, summing over $N$
observations, and dividing by $N - 1$, we obtain

$$\frac{\sum (cX - c\bar{X})^2}{N - 1} = \frac{c^2 \sum (X - \bar{X})^2}{N - 1} = c^2 s^2$$

Thus if all measurements are multiplied by a constant $c$, the variance is
multiplied by $c^2$ and the standard deviation by the absolute value of $c$.
If $c$ is a negative number, say -3, $s$ is multiplied by the absolute value
3. By way of illustration, the measurements 1, 4, 7, 10, 13 have a mean of 7,
a variance of 22.50, and a standard deviation of 4.74. If the measurements
are multiplied by the constant 5, we obtain 5, 20, 35, 50, 65. The mean is
now 5 × 7, or 35. The deviations from the mean are -30, -15, 0, +15, +30.
Squaring these we obtain 900, 225, 0, 225, 900. The sum of squares is 2,250,
the variance is 562.50, and the standard deviation is 23.72, whereas 5 times
the original standard deviation of 4.74 is 23.70. The slight discrepancy
results from the rounding of decimals.
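Both rules are easy to confirm numerically. The following sketch is illustrative, with variable names chosen here as assumptions; it uses the measurements 1, 4, 7, 10, 13, the added constant 5 and the multipliers 5 and -3 mentioned in the text.

```python
from statistics import stdev

marks = [1, 4, 7, 10, 13]

s_original = stdev(marks)
s_shifted = stdev([x + 5 for x in marks])    # adding a constant
s_scaled = stdev([x * 5 for x in marks])     # multiplying by 5
s_negated = stdev([x * -3 for x in marks])   # multiplying by a negative constant

print(round(s_original, 2), round(s_shifted, 2))
print(round(s_scaled, 2), round(5 * s_original, 2))
print(round(s_negated, 2), round(3 * s_original, 2))
```

Carrying full precision, 5 times the original standard deviation agrees with the standard deviation of the multiplied scores, which confirms that the discrepancy between 23.72 and 23.70 comes only from rounding 4.74 before multiplying.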
i 


4.8 STANDARD DEVIATION OF THE FIRST N INTEGERS

We state without proof that the sum of squares of the first $N$ integers is

[4.8]    $$\frac{N(N + 1)(2N + 1)}{6}$$

and the standard deviation of the first $N$ integers is

[4.9]    $$s = \sqrt{\frac{N(N + 1)}{12}}$$

To illustrate, consider the integers 1, 2, 3, 4, 5, 6, 7, 8, 9, 10. Applying
the above formulas, the sum of squares is 385 and the standard deviation is
3.03. These results may be readily checked by direct calculation.

Formula [4.9] is obtained directly from the definition of the standard
deviation as $s = \sqrt{\sum (X - \bar{X})^2 / (N - 1)}$. If the standard
deviation is defined as $s = \sqrt{\sum (X - \bar{X})^2 / N}$, the standard
deviation of the first $N$ integers is $s = \sqrt{(N^2 - 1)/12}$.
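The direct calculation suggested in the text can be carried out as follows. This is an illustrative sketch; the function names are assumptions, and `stdev` and `pstdev` from Python's `statistics` module supply the $N - 1$ and $N$ definitions respectively.

```python
import math
from statistics import pstdev, stdev


def sum_of_squares_first_n(n):
    """Closed form for 1^2 + 2^2 + ... + n^2."""
    return n * (n + 1) * (2 * n + 1) // 6


def sd_first_n(n):
    """Standard deviation of 1..n with N - 1 in the denominator."""
    return math.sqrt(n * (n + 1) / 12)


def sd_first_n_dividing_by_n(n):
    """Standard deviation of 1..n with N in the denominator."""
    return math.sqrt((n * n - 1) / 12)


integers = list(range(1, 11))
print(sum_of_squares_first_n(10), sum(k * k for k in integers))
print(round(sd_first_n(10), 2), round(stdev(integers), 2))
print(round(sd_first_n_dividing_by_n(10), 2), round(pstdev(integers), 2))
```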

These formulas are of particular use in relation to problems involving
ranks (Chap. 21). Where ranks are used, the observations are represented
by the first $N$ integers.


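4.9 THE VARIANCE OF COMBINED GROUPS

On occasion we know the means and variances of two groups of measurements
and may wish to obtain the variance of the two groups combined. We may, for
example, have the means and variances of scores for two classes of
university students and may wish to obtain the variance for the two classes
taken together. Let the number of cases, mean, and variance for one group be
$n_1$, $\bar{X}_1$, and $s_1^2$, and for the other group $n_2$, $\bar{X}_2$,
and $s_2^2$. Let $\bar{X}$ and $s^2$ be the mean and variance of the
combined group. Also, let $n_1 + n_2 = N$, $\bar{X}_1 - \bar{X} = d_1$, and
$\bar{X}_2 - \bar{X} = d_2$. We state without proof that the variance of the
combined group is

[4.10]    $$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + n_1 d_1^2 + n_2 d_2^2}{N - 1}$$

This formula may be extended from two to any number of groups, say, $k$.

To illustrate, seven measurements 1, 6, 8, 10, 13, 18, and 21 have a mean
of 11 and a variance of 48.00; that is, $n_1 = 7$, $\bar{X}_1 = 11$, and
$s_1^2 = 48.00$. The five measurements 1, 4, 7, 10, 13 have a mean of 7 and a
variance of 22.50; that is, $n_2 = 5$, $\bar{X}_2 = 7$, and $s_2^2 = 22.50$.
The mean of all 12 measurements taken together, the combined group, is 9.33.
The quantity $d_1 = 11 - 9.33 = 1.67$ and $d_2 = 7 - 9.33 = -2.33$. The
variance of the combined group is

$$s^2 = \frac{1}{11}\left[(7 - 1) \times 48 + (5 - 1) \times 22.50 + 7(1.67)^2 + 5(-2.33)^2\right] = 38.61$$

The standard deviation $s = \sqrt{38.61} = 6.21$. The above result may be
checked by direct calculation.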
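The direct check can be sketched in Python. The function below is an illustration written for this purpose, not the author's procedure; its name and argument names are assumptions. It combines the two group summaries and compares the result with the variance computed from all 12 measurements pooled.

```python
from statistics import mean, variance


def combined_variance(group1, group2):
    """Variance of two groups pooled, built from group sizes, means and variances."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = mean(group1), mean(group2)
    v1, v2 = variance(group1), variance(group2)   # N - 1 definitions
    grand_mean = (n1 * m1 + n2 * m2) / (n1 + n2)
    d1, d2 = m1 - grand_mean, m2 - grand_mean
    numerator = (n1 - 1) * v1 + (n2 - 1) * v2 + n1 * d1 ** 2 + n2 * d2 ** 2
    return numerator / (n1 + n2 - 1)


first = [1, 6, 8, 10, 13, 18, 21]
second = [1, 4, 7, 10, 13]

from_summaries = combined_variance(first, second)
from_raw_data = variance(first + second)
print(round(from_summaries, 2), round(from_raw_data, 2))
```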


4.10 STANDARD SCORES 


s. The subtraction of the mean from all measurements in a sample does not
change the standard deviation. Standard scores have zero mean and unit
standard deviation. As previously shown, if all measurements in a sample
are multiplied by a constant, the standard deviation is also multiplied by
that constant. The deviations from the mean $X - \bar{X}$ have a standard
deviation $s$. If all deviations are divided by $s$, which amounts to
multiplying by the constant $1/s$, the standard deviation of the scores thus
obtained is $s/s = 1$.

To illustrate, the following observations have been expressed in
raw-score, deviation-score, and standard-score form:

Individual       X       x       z
A                3      -7   -1.11
B                6      -4    -.63
C                7      -3    -.47
D                9      -1    -.16
E               15       5     .79
F               20      10    1.58
Sum             60     .00     .00
Mean            10     .00     .00
s             6.32    6.32    1.00


Because standard scores have zero mean and unit standard deviation,
they are readily amenable to certain forms of algebraic manipulation.
Many formulations can be derived more conveniently using standard
scores than using raw or deviation scores.

The use of standard scores means, in effect, that we are using the
standard deviation as the unit of measurement. In the above example
individual A is 1.11 standard deviations, or standard deviation units, below
the mean, while individual F is 1.58 standard deviation units above the
mean.

Standard scores are frequently used to obtain comparability of
observations obtained by different procedures. Consider examinations in
English and mathematics applied to the same group of individuals and assume
the means and standard deviations to be as follows:

                Mean     s
English           65     8
Mathematics       52    12

In effect, in relation to the performance of the individuals in the group, a
score of 65 on the English examination is the equivalent of a score of 52 on
the mathematics examination. To illustrate, a score one standard deviation
above the mean, that is, 65 + 8, or 73, on the English examination can be
considered to be the equivalent of a score one standard deviation above the
mean, that is, 52 + 12, or 64, on the mathematics examination. If an
individual makes a score of 57 on the English examination and a score of 58
on the mathematics examination, we may compare his relative performance
on the two subjects by comparing his standard scores. On English his
standard score is (57 - 65)/8 = -1.0, and on mathematics his standard score
is (58 - 52)/12 = .5. Thus on English his performance is one standard
deviation unit below the average, while on mathematics his performance is
.5 standard deviation unit above the average. Quite clearly, this individual
did much more poorly in English than in mathematics relative to the
performance of the group of individuals taking the examinations, although
this is not reflected in the original marks assigned. To attain rigorous
comparability of scores, the distributions of scores on the two tests should
be identical in shape. The meaning of this statement will become clear as we
proceed.
The reader should note that the sum of squares of standard scores,
$\sum z^2$, is equal to $N - 1$. We observe that
$z^2 = (X - \bar{X})^2 / s^2$; hence

$$\sum z^2 = \frac{\sum (X - \bar{X})^2}{s^2} = N - 1$$

The reader should note here that if $s^2$ is defined as
$\sum (X - \bar{X})^2 / N$, the sum of squares of standard scores is $N$
and not $N - 1$.
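The properties of standard scores described in this section can be verified on the six observations tabled above. The sketch is illustrative; the function name `standard_scores` and the variable names are assumptions, and the standard deviation uses the $N - 1$ definition.

```python
from statistics import mean, stdev


def standard_scores(values):
    """Convert raw scores to z scores: deviation from the mean divided by s."""
    m = mean(values)
    s = stdev(values)
    return [(x - m) / s for x in values]


raw = [3, 6, 7, 9, 15, 20]            # individuals A to F
z = standard_scores(raw)
sum_of_squared_z = sum(v * v for v in z)

# Comparing one individual's marks on two examinations.
z_english = (57 - 65) / 8
z_mathematics = (58 - 52) / 12

print([round(v, 2) for v in z])
print(round(sum_of_squared_z, 6), z_english, z_mathematics)
```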


4.11 ADVANTAGES OF THE VARIANCE AND STANDARD
DEVIATION AS MEASURES OF VARIATION


i Po nce and meaning of the 
variance and standard deviation in thei 


siderable familiarity with statistical ideas, 


4.12 MOMENTS ABOUT THE MEAN


The mean and the standard deviation are closely related to a family of
descriptive statistics known as moments. The first four moments about the
arithmetic mean are as follows:

[4.12]
$$
\begin{aligned}
m_1 &= \frac{\sum (X - \bar{X})}{N} = 0 \\
m_2 &= \frac{\sum (X - \bar{X})^2}{N} \\
m_3 &= \frac{\sum (X - \bar{X})^3}{N} \\
m_4 &= \frac{\sum (X - \bar{X})^4}{N}
\end{aligned}
$$

In general, the $r$th moment about the mean is given by

[4.13]    $$m_r = \frac{\sum (X - \bar{X})^r}{N}$$

The term "moment" originates in mechanics. Consider a lever supported by a
fulcrum. If a force $f_1$ is applied to the lever at a distance $x_1$ from
the origin, then $f_1 x_1$ is called the moment of the force. Further, if a
second force $f_2$ is applied at a distance $x_2$, the total moment is
$f_1 x_1 + f_2 x_2$. If we square the distances $x$, we obtain the second
moment; if we cube the distances, we obtain the third moment; and so on. When
we come to consider frequency distributions, the origin is the analogue of
the fulcrum and the frequencies in the various class intervals are analogous
to forces operating at various distances from the origin. Observe that the
first moment about the mean is zero and the second moment is $(N - 1)/N$
times the sample variance. The third moment is used to obtain a measure of
skewness, and the fourth moment, a measure of kurtosis.

` 

( 


4.13 MEASURES OF SKEWNESS AND KURTOSIS

A commonly used measure of skewness may be obtained from the second
and third moments and is defined as

[4.14]    $$g_1 = \frac{m_3}{m_2 \sqrt{m_2}}$$

The rationale of this statistic is based on the observation that when a
distribution is symmetrical, the sum of cubes of deviations above the mean
will balance the sum of cubes of deviations below the mean. Thus if the
distribution is symmetrical, $m_3 = 0$ and $g_1 = 0$. If the distribution has
a long tail to the right, the sum of cubes of deviations above the mean will
be greater than the corresponding sum below the mean. Under these
circumstances, the distribution is positively skewed and $g_1$ is positive.
Conversely, if the distribution is negatively skewed, $g_1$ is negative.

PROBABILITY AND THE 
BINOMIAL DISTRIBUTION 


= 


5.1 INTRODUCTION 


der to one particular type of theo- 
distribution, The binomial dis- 
cal distributions which are used 
ities. Why are experimentalists 


of a number of theoreti 


are in probabilistic 
with’ certainty 
however small, 


al 


hypothesis that no difference exists between the two treatments, that one
treatment is no better than the other. He then estimates the probability of
obtaining by random sampling under this trial hypothesis a difference
equal to or greater than the one observed. If this probability is small, say
the chances are less than 5 in 100, he may consider this sufficient evidence
for the rejection of the trial hypothesis and may be prepared to assert that
one method of treatment is better than the other. If the probability is not
small and the observed difference may be expected to occur quite
frequently under the trial hypothesis, say the chances are 20 in 100, then
the evidence does not warrant the conclusion that one treatment is better
than the other.

In general, the interpretation of the data of experiments is in
probabilistic terms. The theory of probability is of the greatest importance
in scientific work where questions about the correspondences between the
deductive consequences of theory and observed data are raised. Probability
theory had its origins in games of chance. It has become basic to the
thinking of the scientist.


THE NATURE OF PROBABILITY 


Diverse views of the nature of probability may be entertained. The topic is
controversial. No inclusive summary of these different views will be
attempted here. We shall discuss three approaches to probability: (1) the
subjective, or personalistic, (2) the formal mathematical, and (3) the
empirical relative-frequency approach. These different ways of regarding
probability are not incompatible.

The term probability may be used subjectively to refer to an attitude of
doubt with respect to some future event. For example, the assertions may
be made that "It will probably rain tomorrow," or "The probability is small
that I shall live to be 90 years old," or "There is a high probability that a
particular horse will win the Kentucky Derby." Frequently, numerical
terms are used in making assertions of this kind, such as, "The odds are
even that it will rain tomorrow," or "I estimate that the chances are about
95 in 100 that I shall die before I am 90 years old," or "The chances are
three to one that a particular horse will win the Kentucky Derby." All such
assertions, whether numerical terms are used or not, refer to feelings of
degrees of doubt or confidence with regard to future outcomes. This
subjective usage is sometimes spoken of as psychological, or personalistic,
probability.
А second usage defines the probability of an event as the ratio of the 


number of favorable cases to the total number of equally likely cases. This
usage stems from a consideration of games of chance involving cards, dice,
and coins. For example, on examining the structure of a die the observation
may be made that no basis exists for choosing one of the six alternatives in
preference to another; consequently all six alternatives may be considered
equally likely. The probability of obtaining a particular result, say, a 3 in
a single toss is then 1/6, there being one favorable case among six equally
likely alternatives. This approach to probability involves a concept of
equally likely cases, which has a degree of intuitive plausibility in relation
to cards, dice, and coins. Difficulties present themselves, however, when we
attempt to apply this approach in situations where it is impossible to
delineate cases which can be construed to be equally likely. These
difficulties have led to the argument that equally likely means the same as
equally probable; therefore the definition is circular because it defines
probability in terms of itself. Arguments have been advanced to escape this
circularity. These need not detain us. The difficulty, however, is readily
resolved by observing that the concept of equally likely in this definition of
probability is a formal postulate and is not empirical. In effect we say,
"Let us postulate that certain events are equally likely, and given this
postulate let us deduce certain consequences." This means that a theory of
probability employing this postulate is a formal mathematical model. It may
or may not correspond to empirical events. It may be demonstrated, however,
that this model does approximate closely certain empirical events and
consequently is of value in dealing with practical problems.
The situation here is somewhat analogous to that in ordinary Euclidian 
geometry. Euclidian geometry is a formal system comprised of a set of 
i » and their deductive consequences, called 
hold regardless of questions of corre- 
We know, however, on the basis of 
тетз can be shown to correspond 
consequence, Euclidian geometry 
with problems in engineering, surveying, building construction, and many
other fields. Both with Euclidian geometry and probability it is useful to
draw a clear distinction between the formal mathematical system and the
empirical events for which the formal system may serve as a model.

A third approach t 
frequencies. If a serie М, and a given event occurs Г 
times, then r/N is the i 


» the subjective or per- 




relative frequencies, are not incom atible, and i i 4 
that all three must of necessity een ie. н ыы argued а 
probability may be an interesting topic of psychological investigation, in
practical statistical work use is made mainly of the formal mathematical and
relative-frequency approaches, the latter being the empirical complement of
the former.


5.3 POSSIBLE OUTCOMES

Questions of probability frequently involve a consideration of all the
possible outcomes, sometimes spoken of as a set. For example, in
tossing a single coin, there are two possible outcomes: either a head or a
tail will occur. In tossing two coins, the four possible outcomes are HH,
HT, TH, TT. This means that both coins may be heads, the first a head and
the second a tail, the first a tail and the second a head, or both tails. In
tossing three coins, there are eight possible outcomes, which may be
represented as HHH, HHT, HTH, THH, HTT, THT, TTH, TTT. In tossing
four coins, the number of possible outcomes is 16. In throwing a single die,
the number of possible outcomes is six; that is, either 1, 2, 3, 4, 5, or 6
may appear. In throwing two dice, the possible outcomes may be
listed as follows:

11   21   31   41   51   61
12   22   32   42   52   62
13   23   33   43   53   63
14   24   34   44   54   64
15   25   35   45   55   65
16   26   36   46   56   66

These are the numbers which may come up in throwing two dice. Both dice
may come up 1, the first may come up 2 and the second 1, the first 3 and the
second 1, and so on. Thus there are 36 possible outcomes to this
experiment. In drawing a single card from a deck of 52 cards, the number of
possible outcomes is 52. In drawing one card from a deck of 52 cards and
another from a different deck, the number of possible outcomes is 52 ×
52 = 2,704.

In the illustrative situations described above we may be willing to
assume that the possible outcomes are equally likely. With most coins,
dice, and cards this is a justifiable assumption, the validity of which can
be readily checked by experiment. Under this assumption the probability of
obtaining two heads, HH, in tossing two coins is 1/4, since this outcome
occurs once in four equally probable outcomes. We denote this probability by
p(HH) = 1/4. The probability of obtaining three tails in tossing three coins
is p(TTT) = 1/8, there being eight equally likely outcomes. The probability
of obtaining two 6's in throwing two dice is p(6,6) = 1/36. This event is
expected to occur once in the 36 equally likely outcomes.
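The counting in this section can be reproduced by listing the outcomes mechanically. The following sketch is an illustration under the equally-likely assumption stated above; the helper name `probability` is an assumption, and `itertools.product` generates every ordered outcome.

```python
from fractions import Fraction
from itertools import product

two_coins = list(product("HT", repeat=2))       # HH, HT, TH, TT
three_coins = list(product("HT", repeat=3))
four_coins = list(product("HT", repeat=4))
two_dice = list(product(range(1, 7), repeat=2))


def probability(outcomes, is_favorable):
    """Favorable cases divided by the total number of equally likely cases."""
    favorable = sum(1 for outcome in outcomes if is_favorable(outcome))
    return Fraction(favorable, len(outcomes))


p_hh = probability(two_coins, lambda o: o == ("H", "H"))
p_ttt = probability(three_coins, lambda o: o == ("T", "T", "T"))
p_66 = probability(two_dice, lambda o: o == (6, 6))

print(len(two_coins), len(three_coins), len(four_coins), len(two_dice))
print(p_hh, p_ttt, p_66)
```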


The illustrative applications above regard a probability as the ratio of the
number of favorable cases to the total number of equally likely cases. This
approach may be extended to situations where the possible outcomes are not
equally likely, provided a method exists for either deducing
or estimating the expected relative frequencies of the possible outcomes.
In rolling two dice, we may decide simply to sum the two numbers which
appear and to regard this sum as an outcome. The set of all possible
outcomes, thus defined, consists of the 11 numbers 2, 3, 4, . . . , 11, 12.
These outcomes are not, of course, equally likely. Their expected relative
frequencies may be obtained, however, from a consideration of the 36
possible, equally likely, outcomes involved in rolling two dice. A 2 can occur
only once in 36 equally likely outcomes; that is, p(2) = 1/36. A 3 can occur
twice; that is, p(3) = 2/36. Likewise p(4) = 3/36, p(5) = 4/36, and so on.
Thus the possible outco 
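The probabilities of the 11 possible sums follow from counting how many of the 36 equally likely outcomes yield each sum. The sketch below is illustrative; the variable names are assumptions. Note that Python's `Fraction` reduces to lowest terms, so 2/36 is displayed as 1/18.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 ordered outcomes produce each sum.
counts = Counter(first + second for first, second in product(range(1, 7), repeat=2))
p_sum = {total: Fraction(count, 36) for total, count in sorted(counts.items())}

for total, p in p_sum.items():
    print(total, counts[total], p)
```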


5.4 JOINT AND CONDITIONAL PROBABILITIES

Consider a population of members classified with regard to two
characteristics A and B, there being three classes, or strata, of A and two
classes, or strata, of B, as follows:


; that is, p(B,) = .60. The proba- 


bility that he is A, is 
both 4, and В, is -40; that is i ility i 
Joint probability. It is the probabili baee уы 


bility that a 


member will fall simultaneously 
probabilitie: 


$ in the above table are joint 


Now given the fact that a member is $B_1$, what is the probability that he
is $A_1$, $A_2$, or $A_3$? To answer this question we divide the joint
probabilities $p(A_1B_1) = .40$, $p(A_2B_1) = .15$, and $p(A_3B_1) = .05$ by
the probability $p(B_1) = .60$ to obtain $p(A_1/B_1) = .67$,
$p(A_2/B_1) = .25$, and $p(A_3/B_1) = .08$. These are conditional
probabilities. They are the probabilities of $A_1$, $A_2$, and $A_3$ given
$B_1$. Note that the sum of these three probabilities is 1.00. Likewise, if a
member is $B_2$, what is the probability that he is $A_1$, $A_2$, or $A_3$?
Here $p(A_1/B_2) = .60$, $p(A_2/B_2) = .25$, and $p(A_3/B_2) = .15$. These
conditional probabilities may be written in tabular form as follows:


Likewise, the conditional probabilities of B given 4 are 


1.00 1.00 1.00 
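The division that turns joint probabilities into conditional probabilities can be sketched as follows. The example is illustrative and uses only the three joint probabilities for stratum $B_1$ quoted in the text; the dictionary keys and variable names are assumptions.

```python
# Joint probabilities p(A_i B_1) quoted in the text for stratum B1.
joint_b1 = {"A1": 0.40, "A2": 0.15, "A3": 0.05}

# The probability of B1 is the sum of the joint probabilities in that stratum.
p_b1 = sum(joint_b1.values())

# Conditional probability of each A class given B1: joint divided by p(B1).
conditional_given_b1 = {a: p / p_b1 for a, p in joint_b1.items()}

print(round(p_b1, 2))
print({a: round(p, 2) for a, p in conditional_given_b1.items()})
```

The conditional probabilities within a stratum necessarily sum to 1.00, since each is a share of that stratum's total probability.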


THE ADDITION AND MULTIPLICATION OF PROBABILITIES 


In throwing a die six possible events may occur. If we are prepared to
assume, as in the formal mathematical approach to probability, that these
six events are equally likely, then the probability of obtaining a 1, 2, 3, 4,
5, or 6 in a single throw is 1/6, the ratio of the number of favorable cases
to the number of equally likely cases. Consider now the probability of
obtaining either a 1, 2, or 3 in a single throw. Since there are now three
favorable cases among six equally likely cases, this probability is readily
observed to be 1/6 + 1/6 + 1/6 = 1/2. This is an application of the addition
theorem of probability. This theorem states that the probability that any one
of a number of mutually exclusive events will occur is the sum of the
probabilities of the separate events. "Mutually exclusive" means that if one
event occurs, the others cannot. To illustrate further, in tossing two coins
four possible events may occur. Both coins may be heads, both may be tails,
the first may be a head and the second a tail, or the first may be a tail and
the second a head. These events exhaust the possible outcomes. They may be
represented as HH, TT, HT, TH. Again, if we assume these four events to
be equally likely, the probability of any one of the four events is 1/4. By
the addition theorem the probability of either two heads or two tails, that
is, HH or TT, is 1/4 + 1/4 = 1/2.

In throwing two dice the number of possible outcomes is 36, and the
probability of any particular outcome, assuming these to be equally likely,
is 1/36, which is the product of the two independent probabilities, or
1/6 × 1/6. This is an application of the multiplication theorem of
probability. This theorem states that the probability of the joint occurrence
of two or more mutually independent events is the product of their separate
probabilities. By mutually independent is meant that the occurrence of one
event does not affect the occurrence of the other events. To illustrate, the
probability of obtaining four heads in four tosses of a coin is
1/2 × 1/2 × 1/2 × 1/2 = 1/16. The probability of drawing the ace, king, and
queen of spades in that order in drawing one card from each of three
well-shuffled decks of 52 cards is 1/52 × 1/52 × 1/52 = 1/140,608. The
probability of drawing the ace, king, and queen of spades in that order, and
without replacement, from a single deck of 52 cards is 1/52 × 1/51 × 1/50,
or 1/132,600. The probability that the first card is the ace of spades is
1/52. Having drawn one card, 51 cards remain, and the probability that the
second card is the king of spades is 1/51. Similarly, the probability that
the third card is the queen of spades is 1/50. The probability of the
combined event is the product of these separate probabilities, each taken
given the cards already drawn.
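The arithmetic of the addition and multiplication theorems in this section can be checked with exact fractions. The sketch is illustrative and the variable names are assumptions.

```python
from fractions import Fraction

# Addition theorem: mutually exclusive events, probabilities add.
p_face = Fraction(1, 6)
p_one_two_or_three = p_face + p_face + p_face
p_hh_or_tt = Fraction(1, 4) + Fraction(1, 4)

# Multiplication theorem: independent events, probabilities multiply.
p_double_on_two_dice = p_face * p_face
p_four_heads = Fraction(1, 2) ** 4
p_ace_king_queen_three_decks = Fraction(1, 52) ** 3

# Without replacement the deck shrinks by one card after each draw.
p_ace_king_queen_one_deck = Fraction(1, 52) * Fraction(1, 51) * Fraction(1, 50)

print(p_one_two_or_three, p_hh_or_tt, p_double_on_two_dice, p_four_heads)
print(p_ace_king_queen_three_decks, p_ace_king_queen_one_deck)
```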


5.6 PROBABILITY DISTRIBUTIONS


pt of a frequency distribution was dis- 
was defined as an arrangement of the 


Class interval      f
40-49               5
30-39              10
20-29              20
10-19              10
0-9                 5
Total              50

Class interval      p
40-49             .10
30-39             .20
20-29             .40
10-19             .20
0-9               .10
Total            1.00


The probability for the interval 40 to 49 is .10. This means that if a member
were selected at random from the original group of 50 members the
probability is .10 that he would fall within the interval 40 to 49. Similarly
the probability that the member would fall within the interval 30 to 39 is
.20, and so on. Such an arrangement is called a probability distribution. A
probability distribution specifies the probabilities associated with
particular values of the variable or the probabilities associated, where class
intervals are used, with defined ranges of the variable.

Probability distributions can be represented graphically by plotting
probabilities along the vertical axes, instead of frequencies, as in the case
of frequency distributions.
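Converting a frequency distribution into a probability distribution amounts to dividing each frequency by the total number of members. The sketch below is illustrative. The frequencies are an assumption: they are the values implied by the tabled probabilities and the group size of 50 stated in the text, since only part of the frequency table is legible in the source.

```python
# Frequencies implied by the tabled probabilities and N = 50 (assumed values).
frequencies = {"40-49": 5, "30-39": 10, "20-29": 20, "10-19": 10, "0-9": 5}

total = sum(frequencies.values())

# Each probability is the relative frequency of its class interval.
probabilities = {interval: f / total for interval, f in frequencies.items()}

for interval, p in probabilities.items():
    print(f"{interval:>6}  {p:.2f}")
print(f"{'Total':>6}  {sum(probabilities.values()):.2f}")
```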

In sections to follow, a particular type of theoretical probability
distribution, known as the binomial distribution, is discussed. Before
proceeding with a discussion of the binomial distribution it will prove useful
to review the reader's knowledge of permutations and combinations.


PERMUTATIONS AND COMBINATIONS 
A knowledge of permutations and combinations is useful in dealing with 
many problems involving probabilities. 

Consider two objects labeled A and B. Two arrangements are possible,
AB and BA. With three objects labeled A, B, and C, six arrangements are
possible. These are ABC, ACB, BAC, BCA, CAB, and CBA. These arrangements
are called permutations. In general, if there are $n$ distinguishable
objects, the number of permutations of these objects taken $n$ at a time is
given by $n!$, or $n$ factorial, which is the product of all integers from
$n$ to 1, or $n(n - 1)(n - 2) \cdots 3 \times 2 \times 1$. For $n = 3$,
$n! = 3 \times 2 \times 1 = 6$. For $n = 5$,
$n! = 5 \times 4 \times 3 \times 2 \times 1 = 120$. Consider the number of
seating arrangements of eight guests in eight chairs at a dinner table. The
first guest may sit in any one of eight chairs. When the first guest is
seated, the second guest may sit in any one of the remaining seven chairs.
Thus the number of possible arrangements for the first two guests is
8 × 7 = 56. When the first two guests are seated, the third guest may occupy
any one of the remaining six chairs, and so on for the remaining guests. The
number of possible seating arrangements for the eight guests is 8!, or
40,320, a number which may help explain the indecision of many hostesses.

Instead of considering the number of ways of arranging n things n at a time, we may consider the number of ways of arranging n things r at a time where r is less than n. Thus the possible arrangements of the objects A, B, and C taken two at a time are AB, AC, BA, BC, CA, and CB. Here we observe that there are three ways of selecting the first object and two ways of selecting the second. The number of arrangements is then 3 × 2 = 6. Similarly, on considering the number of arrangements of 10 objects taken three at a time, we observe that there are 10 ways of selecting the first, 9 ways of selecting the second, and 8 ways of selecting the third. The number of arrangements is then 10 × 9 × 8 = 720. In general, the number of permutations of n things taken r at a time is

$$P_r^n = n(n-1)\cdots(n-r+1) = \frac{n!}{(n-r)!} \qquad [5.1]$$
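The counting rules for permutations can be checked numerically. The following is a minimal Python sketch, not part of the original text; the function name `permutations_count` is an illustrative assumption, while `math.factorial` and `math.perm` are standard-library functions.

```python
from math import factorial, perm


def permutations_count(n: int, r: int) -> int:
    """Number of ordered arrangements of n distinguishable objects taken r at a time."""
    # n! / (n - r)! is the same product as n(n-1)...(n-r+1)
    return factorial(n) // factorial(n - r)


print(factorial(8))               # seating eight guests in eight chairs
print(permutations_count(3, 2))   # objects A, B, C taken two at a time
print(permutations_count(10, 3))  # 10 objects taken three at a time
print(perm(10, 3))                # the same count from the standard library
```

Writing the count as a ratio of factorials mirrors the general formula; `math.perm` is shown only as an independent check.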


The number of different ways of selecting objects from a set, ignoring the order in which they are arranged, is the number of combinations. Given the four objects A, B, C, and D, the number of permutations of two from this set is 4 × 3 = 12. The arrangements are AB, BA, AC, CA, AD, DA, BC, CB, BD, DB, CD, and DC.


The number of со 


THE BINOMIAL DISTRIBUTION 


In tossing 10 coins wh 
heads? We are required 


different ways. A may be a head and all the others tails, B may be a head and all the others tails, and so on. Since one head can occur in 10 different ways, the probability of obtaining one head and nine tails is $10(\tfrac{1}{2})^{10} = 10/1{,}024$. Thus in tossing 10 coins there are 10 chances in 1,024 of obtaining one head and nine tails.


[5.3] 


PROBABILITY AND THE BINOMIAL DISTRIBUTION 79 
D 


Determining the probability of obtaining two heads and eight tails may be similarly approached. The probability that coins A and B are heads is $(\tfrac{1}{2})^2$. The probability that all the remaining coins are tails is $(\tfrac{1}{2})^8$. The probability that A and B are heads and all the remaining coins are tails is $(\tfrac{1}{2})^{10}$. We readily observe, however, that two heads can occur in quite a number of different ways. This number is the number of combinations of ten things taken two at a time, $C_2^{10}$, which is 10 × 9/2 = 45. Therefore the probability of obtaining two heads and eight tails is $45(\tfrac{1}{2})^{10}$, or 45/1,024. Similarly, the probability of obtaining three heads and seven tails is $C_3^{10}(\tfrac{1}{2})^{10} = 120/1{,}024$. Likewise, the probability of obtaining four heads and six tails is $C_4^{10}(\tfrac{1}{2})^{10} = 210/1{,}024$; and so on. The probabilities of obtaining different numbers of heads in tossing 10 coins are then as follows:


No. of heads    Probability

10                1/1,024
 9               10/1,024
 8               45/1,024
 7              120/1,024
 6              210/1,024
 5              252/1,024
 4              210/1,024
 3              120/1,024
 2               45/1,024
 1               10/1,024
 0                1/1,024
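The table of probabilities for 10 coins can be reproduced by counting combinations. This is a small Python sketch under the text's assumption of unbiased coins; `heads_probabilities` is an illustrative name, not something defined in the book.

```python
from fractions import Fraction
from math import comb


def heads_probabilities(n: int) -> dict[int, Fraction]:
    """Exact probability of each number of heads when tossing n unbiased coins."""
    total = 2 ** n  # number of equally likely outcomes
    return {r: Fraction(comb(n, r), total) for r in range(n + 1)}


probs = heads_probabilities(10)
for r in range(10, -1, -1):
    # Fraction reduces to lowest terms, so the count over 1,024 is printed separately
    print(f"{r:2d}  {comb(10, r):3d}/1,024  {float(probs[r]):.5f}")
```

Exact fractions are used so that the probabilities sum to exactly 1, with no rounding error.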


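The above probabilities are the successive terms in the expansion of the symmetrical binomial $(\tfrac{1}{2} + \tfrac{1}{2})^{10}$. This expansion is

$$(\tfrac{1}{2} + \tfrac{1}{2})^{10} = (\tfrac{1}{2})^{10} + 10(\tfrac{1}{2})^{10} + \frac{10 \times 9}{1 \times 2}(\tfrac{1}{2})^{10} + \frac{10 \times 9 \times 8}{1 \times 2 \times 3}(\tfrac{1}{2})^{10} + \cdots + (\tfrac{1}{2})^{10}$$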


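This is a particular case of the symmetrical binomial $(\tfrac{1}{2} + \tfrac{1}{2})^n$ whose terms are

$$(\tfrac{1}{2} + \tfrac{1}{2})^n = (\tfrac{1}{2})^n + n(\tfrac{1}{2})^n + \frac{n(n-1)}{1 \times 2}(\tfrac{1}{2})^n + \frac{n(n-1)(n-2)}{1 \times 2 \times 3}(\tfrac{1}{2})^n + \cdots + (\tfrac{1}{2})^n$$
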
The symmetrical binomial is a particular case of the general form of the binomial $(p + q)^n$, where p is the probability that an event will occur and q is the probability that it will not occur; that is, p + q = 1. This binomial may be written as follows:


$$(p + q)^n = p^n + np^{n-1}q + \frac{n(n-1)}{1 \times 2}p^{n-2}q^2 + \frac{n(n-1)(n-2)}{1 \times 2 \times 3}p^{n-3}q^3 + \cdots + q^n \qquad [5.4]$$


The terms of this expansion for n = 2, n = 3, and n = 4 are as follows:

$$(p + q)^2 = p^2 + 2pq + q^2$$
$$(p + q)^3 = p^3 + 3p^2q + 3pq^2 + q^3$$
$$(p + q)^4 = p^4 + 4p^3q + 6p^2q^2 + 4pq^3 + q^4$$


The binomial $(p + q)^n$ can be readily illustrated by considering a problem involving the rolling of dice. What are the probabilities of obtaining five, four, three, two, one, and zero 6's in rolling five dice presumed to be unbiased? The probability of obtaining a 6 in rolling a single die is $\tfrac{1}{6}$, whereas the probability of not obtaining a 6 is $\tfrac{5}{6}$. The required probabilities are given by the six terms of the binomial $(\tfrac{1}{6} + \tfrac{5}{6})^5$. These terms are

$$(\tfrac{1}{6} + \tfrac{5}{6})^5 = (\tfrac{1}{6})^5 + 5(\tfrac{1}{6})^4(\tfrac{5}{6}) + \frac{5 \times 4}{1 \times 2}(\tfrac{1}{6})^3(\tfrac{5}{6})^2 + \frac{5 \times 4 \times 3}{1 \times 2 \times 3}(\tfrac{1}{6})^2(\tfrac{5}{6})^3 + 5(\tfrac{1}{6})(\tfrac{5}{6})^4 + (\tfrac{5}{6})^5$$


Thus the probabilities of obtaining five, four, three, two, one, and zero 6's in rolling five dice are calculated as

No. of 6's    Probability

5                 1/7,776
4                25/7,776
3               250/7,776
2             1,250/7,776
1             3,125/7,776
0             3,125/7,776
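The dice probabilities are the terms of the general binomial with p = 1/6. The following Python sketch evaluates each term exactly; the function name `binomial_term` is an illustrative assumption rather than notation from the text.

```python
from fractions import Fraction
from math import comb


def binomial_term(n: int, r: int, p: Fraction) -> Fraction:
    """Probability of exactly r occurrences in n independent trials."""
    q = 1 - p  # probability that the event does not occur
    return comb(n, r) * p ** r * q ** (n - r)


p_six = Fraction(1, 6)
dice = {r: binomial_term(5, r, p_six) for r in range(5, -1, -1)}
for r, prob in dice.items():
    # multiply by 7,776 = 6**5 to recover the numerators used in the table
    print(r, prob * 7776, "/ 7,776")
```

The same function gives the coin results when `p` is set to `Fraction(1, 2)`.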

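In general, any term in the binomial expansion is given by

$$C_r^n p^r q^{n-r} = \frac{n!}{r!(n-r)!}\, p^r q^{n-r} \qquad [5.5]$$

where $C_r^n$ is the number of combinations of n things taken r at a time. Thus the probability of obtaining three heads in 10 tosses of a coin is

$$\frac{10!}{3!\,7!}(\tfrac{1}{2})^3(\tfrac{1}{2})^7 = \frac{120}{1{,}024}$$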


The coefficients $C_r^n$ in any expansion are

$$1, \quad n, \quad \frac{n(n-1)}{1 \times 2}, \quad \frac{n(n-1)(n-2)}{1 \times 2 \times 3}, \quad \ldots$$

These coefficients may be rapidly obtained for different values of N from what is known as Pascal's triangle. The coefficients for different values of N are written in rows in the form of a triangle as shown in Table 5.1. Each number in any row is the sum of the two numbers to the left and right on the row above. This device is very useful in generating expected frequencies and probabilities. For example, for N = 10, the entries in the triangle are the expected frequencies of heads, or tails, in tossing 10 coins 1,024 times. The required probabilities in this case are obtained by dividing the frequencies by 1,024.

Table 5.1 Pascal's triangle

 N
 0    1
 1    1   1
 2    1   2   1
 3    1   3   3   1
 4    1   4   6   4   1
 5    1   5  10  10   5   1
 6    1   6  15  20  15   6   1
 7    1   7  21  35  35  21   7   1
 8    1   8  28  56  70  56  28   8   1
 9    1   9  36  84 126 126  84  36   9   1
10    1  10  45 120 210 252 210 120  45  10   1
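The rule for building Pascal's triangle, each entry being the sum of the two entries above it, can be sketched in a few lines of Python. The function name `pascal_rows` is an illustrative assumption.

```python
def pascal_rows(max_n: int) -> list[list[int]]:
    """Rows 0 through max_n of Pascal's triangle."""
    rows = [[1]]
    for _ in range(max_n):
        prev = rows[-1]
        # each interior entry is the sum of the two adjacent entries in the row above
        interior = [a + b for a, b in zip(prev, prev[1:])]
        rows.append([1] + interior + [1])
    return rows


rows = pascal_rows(10)
for n, row in enumerate(rows):
    print(n, row)

# Dividing the row for 10 coins by 1,024 gives the corresponding probabilities
print([c / 1024 for c in rows[10]])
```

Only additions are needed, which is why the triangle was a convenient hand-calculation device.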


PROPERTIES OF THE BINOMIAL 


For the symmetrical binomial, where $p = q = \tfrac{1}{2}$, the mean, variance, skewness, and kurtosis are

$$\mu = n/2$$
$$\sigma^2 = n/4$$
$$\gamma_1 = 0$$
$$\gamma_2 = -2/n$$

Here we use the symbols $\mu$, $\sigma^2$, $\gamma_1$, and $\gamma_2$ instead of $\bar{X}$, $s^2$, $g_1$, and $g_2$ because the binomial is a theoretical distribution. $\mu$, $\sigma^2$, $\gamma_1$, and $\gamma_2$ may be viewed as population parameters rather than sample estimates. The above formulas may be illustrated by considering the tossing of five coins 32 times. The expected frequencies of zero, one, two, three, four, and five heads are 1, 5, 10, 10, 5, 1. These frequencies are the coefficients of the expansion $(\tfrac{1}{2} + \tfrac{1}{2})^5$. Using the formula $\mu = n/2$, the mean is $\mu = 2.5$. This is the expected mean number of heads in tossing five unbiased coins 32 times. The variance of the distribution is $\sigma^2 = n/4 = 5/4 = 1.25$. Because the distribution is symmetrical, the measure of skewness $\gamma_1 = 0$. The measure of kurtosis $\gamma_2 = -2/n = -2/5 = -.40$. Note that as n increases in size, $\gamma_2$ becomes smaller in absolute value and approaches zero as a limit.

The above values may be obtained by direct calculation. Denote the number of heads by X, the frequencies by f, and a deviation from the mean of X by x.

X        f     fX      x       fx^2     fx^3       fx^4

5        1      5     2.50     6.25    15.625    39.0625
4        5     20     1.50    11.25    16.875    25.3125
3       10     30      .50     2.50     1.250      .6250
2       10     20     -.50     2.50    -1.250      .6250
1        5      5    -1.50    11.25   -16.875    25.3125
0        1      0    -2.50     6.25   -15.625    39.0625

Total   32     80             40.00      .000   130.0000

The arithmetic mean, $\mu$, of this distribution is $\sum fX/N = 80/32 = 2.50$. The variance, $\sigma^2$, is $\sum fx^2/N = 40/32 = 1.25$. $\sigma^2$ is here defined as $\sum fx^2/N$, and not $\sum fx^2/(N-1)$. Here we are dealing with a theoretical model, and not a sample value. The skewness $\gamma_1$ is readily seen to be zero because the third moment $\sum fx^3/N$ is zero. The fourth moment is $\sum fx^4/N = 130/32 = 4.0625$ and the kurtosis is $\gamma_2 = 4.0625/1.25^2 - 3 = -.40$. Here N denotes the number of tosses, or observations, whereas n denotes the number of coins.


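For the general form of the binomial $(p + q)^n$, the mean, variance, skewness, and kurtosis are

$$\mu = np$$
$$\sigma^2 = npq$$
$$\gamma_1 = \frac{q - p}{\sqrt{npq}} \qquad [5.7]$$
$$\gamma_2 = \frac{1 - 6pq}{npq}$$

In tossing an unbiased die, the probability p of throwing a 6 is $\tfrac{1}{6}$ and the probability q of not throwing a 6 is $\tfrac{5}{6}$. The expected number of 6's in tossing 10 dice is given by $\mu = np = 10 \times \tfrac{1}{6} = 1.667$. The variance is $\sigma^2 = npq = 10 \times \tfrac{1}{6} \times \tfrac{5}{6} = 1.389$. The skewness $\gamma_1 = .566$ and the kurtosis $\gamma_2 = .12$.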
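The closed-form expressions for the mean, variance, skewness, and kurtosis can be compared with values summed term by term over the distribution. The Python sketch below does this for the five-coin and ten-dice cases; the function names are illustrative assumptions, and the moments divide by the total probability (a theoretical model), not by N - 1.

```python
from math import comb, sqrt


def moments_direct(n: int, p: float) -> tuple[float, float, float, float]:
    """Mean, variance, skewness, kurtosis summed term by term over the binomial."""
    q = 1 - p
    probs = [comb(n, r) * p ** r * q ** (n - r) for r in range(n + 1)]
    mean = sum(r * pr for r, pr in enumerate(probs))
    m2 = sum((r - mean) ** 2 * pr for r, pr in enumerate(probs))
    m3 = sum((r - mean) ** 3 * pr for r, pr in enumerate(probs))
    m4 = sum((r - mean) ** 4 * pr for r, pr in enumerate(probs))
    return mean, m2, m3 / m2 ** 1.5, m4 / m2 ** 2 - 3


def moments_formula(n: int, p: float) -> tuple[float, float, float, float]:
    """The same four quantities from the closed-form expressions."""
    q = 1 - p
    npq = n * p * q
    return n * p, npq, (q - p) / sqrt(npq), (1 - 6 * p * q) / npq


for n, p in ((5, 0.5), (10, 1 / 6)):
    print("n =", n, "p =", round(p, 4))
    print("  direct :", [round(v, 3) for v in moments_direct(n, p)])
    print("  formula:", [round(v, 3) for v in moments_formula(n, p)])
```

The two routes should agree to within floating-point error, which is a useful check on either calculation.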


5.10 A HYPOTHETICAL EXPERIMENT


The binomial distribution is frequently used as a model in evaluating experimental results. Such uses of the binomial may be illustrated with reference to a hypothetical experiment.

An individual asserts that he has certain psychic powers which enable him to predict the outcome of future events. An experiment is arranged involving the tossing of a coin. The individual is required to predict the outcome in 10 tosses. If we operate on the working hypothesis that the individual possesses no powers of the type claimed, the probability of a correct prediction by chance alone in a single toss of the coin is $\tfrac{1}{2}$. From the binomial expansion $(\tfrac{1}{2} + \tfrac{1}{2})^{10}$ we can ascertain the probabilities of different numbers of correct predictions. Thus the probability of the individual successfully predicting the outcome in all 10 trials by chance alone is 1/1,024, or .00098. The probability of nine successful predictions and one failure is 10/1,024, or .00977. The probability of eight successful predictions and two failures is 45/1,024, or .04395, and so on. The probability of nine or more successful predictions is .00977 + .00098 = .01075, and the probability of eight or more successful predictions is .04395 + .00977 + .00098 = .05470. Now clearly, before undertaking the experiment, some agreement must be reached regarding the number of correct predictions we are prepared to accept as evidence for the rejection of the hypothesis that the individual possesses no powers of the type claimed.

We may agree arbitrarily that if the results obtained in the experiment could have occurred by chance with a small probability only, say, equal to or less than .05, then these results would be accepted as at least not incompatible with the claims for psychic powers. We observe that the probability of eight or more correct predictions by chance alone is .05470. This is greater than the .05 probability we have agreed to accept; consequently, eight correct predictions would in this case not be considered sufficient evidence. The only possibilities here which would prove acceptable within the criterion adopted are nine or ten correct predictions.

The experiment is conducted; seven correct predictions and three failures are obtained. The probability of seven or more correct predictions occurring by chance alone in ten trials may be calculated from the binomial distribution and is 176/1,024, or .17189. Thus there are about 17 chances in 100 of obtaining a result by ordinary guessing equal to or better than the one observed. In consequence, the experimental results provide no acceptable basis for rejecting the working hypothesis that the individual possesses no powers of the type claimed.
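The tail probabilities used in this argument can be computed directly. The following Python sketch assumes chance guessing with p = 1/2 and the .05 criterion adopted in the text; `prob_at_least` and `critical` are illustrative names.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float = 0.5) -> float:
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, r) * p ** r * (1 - p) ** (n - r) for r in range(k, n + 1))


for k in (10, 9, 8, 7):
    print(k, "or more correct:", round(prob_at_least(k, 10), 5))

# Smallest number of correct predictions whose chance probability is .05 or less
critical = min(k for k in range(11) if prob_at_least(k, 10) <= 0.05)
print("critical number of correct predictions:", critical)
```

The exact value for seven or more is 176/1,024 = .171875; the figure .17189 in the text is what results from adding the four individually rounded probabilities .11719, .04395, .00977, and .00098.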
Let us suppose that the individual had made 10 correct predictions. Could we reasonably argue from this result that the individual in question did in fact possess psychic power? Quite clearly, such a result is not incompatible with the assertion of psychic power and provides no basis for rejecting that assertion. We observe, however, that circumstances other than the possession of psychic power may possibly have led to the result obtained; that is, alternative explanations of the results may be possible. In experimental situations of the type described we would usually require more than 10 trials. Let us suppose that 1,000 trials had been made and 550 correct predictions obtained. The probabilities required to evaluate this result would then be generated by the expansion $(\tfrac{1}{2} + \tfrac{1}{2})^{1000}$. Quite clearly, the calculation of the required probabilities directly from the binomial would involve almost prohibitive arithmetical labor. Fortunately, a very close approximation to the required probabilities can be readily obtained from the normal probability distribution, which we shall now consider.


EXERCISES

1 In rolling a die, what is the probability of obtaining either a 5 or a 6?


2 In rolling two dice on one occasion, what is the probability of obtaining either a 7 or an 11? Note that there are 36 possible outcomes.

3 In rolling two dice, what is the probability that neither a 2 nor a 9 will appear?

4 In rolling two dice, what is the probability of obtaining a value less than 6?

5 On four consecutive rolls of a die a 6 is obtained. What is the probability of obtaining a 6 on the fifth roll?


6 In drawing four cards without replacement from a well-shuffled deck, what is the probability of obtaining four aces?

7 An urn contains four black and three white balls. If they are drawn without replacement, what is the probability of the order BWBWBWB?

8 In seating eight people at a table with eight chairs, what is the number of possible seating arrangements?

9 In how many ways can two people seat themselves at a table with four chairs?


10 In tossing five coins what ів th ili 
ле ini en 
ines , probability of obtaining fewer th 
11 A multiple-choice test contains 100 questions. Each question has five alternatives. If a student guesses all questions, what score might he expect to obtain?


12 In how many ways can a committee of three be chosen from a group of five men?

13 Assume that intelligence and honesty are independent. If 10 per cent of a population is intelligent and 60 per cent is honest, what is the probability that an individual selected at random is both intelligent and honest?

14 A husband engages in random verbal behavior for 20 min in an hour. His wife is similarly engaged for 30 min in that hour. Neither listens to the other. Estimate the number of minutes of silence in the hour.

15 A coin is tossed 10 times. What is the probability that the third head will appear on the tenth toss?

16 The chances are three out of four that the weather tomorrow will be like the weather today. Today is Sunday and is a sunny day. What is the probability that it will be sunny all week? What is the probability that it will be sunny until Wednesday and not sunny on Thursday, Friday, and Saturday?

17 What is the expected distribution of heads in tossing six coins 64 times?

18 What is the expected distribution of 6's in rolling six dice 64 times?

19 What is the probability of obtaining either nine or more heads or three or fewer heads in tossing 12 coins?

20 A man tosses six coins and rolls six dice simultaneously. What is the probability of five or more heads and five or more 6's?


6 


THE NORMAL CURVE 


6.1 INTRODUCTION 


6.2 


In Chap. 5 the binomial distribution was discussed in detail. The symmetrical binomial $(\tfrac{1}{2} + \tfrac{1}{2})^n$ was used to illustrate the binomial expansion. Instead of considering $(\tfrac{1}{2} + \tfrac{1}{2})^n$ we might consider the more general form $(p + q)^n$. As n increases in size this distribution will approach a continuous


the length of the edge. Consider the equation Y = bX + a. This is a linear function. It is the equation for a straight line; Y and X are variables; b and a are constants. If b and a are known, different values of X can be substituted in the equation and the corresponding values of Y obtained. If the paired values of Y and X are plotted on graph paper, Y on the vertical and X on the horizontal axis, a straight line results. Y and X bear a functional relation to each other, and this relation is linear. Y is sometimes spoken of as the dependent and X the independent variable. A functional relation may be written in the general form Y = f(X). This simply states that Y is some function of X. Here the nature of the function is not specified.

Consider now the binomial $(p + q)^n$. The terms in the binomial expansion are the expected relative frequencies or probabilities associated with particular events. Inspection of formula [5.4] indicates that any term in the binomial expansion is given by

$$p_r = C_r^n p^r q^{n-r} \qquad [6.1]$$

where $p_r$ is the probability of the rth event. This expression is a function. For fixed values of n, p, and q, different values of r may be substituted on the right and the corresponding values of $p_r$ obtained. Here $p_r$ is the dependent and r the independent variable. The variable r is restricted to the n + 1 values 0, 1, 2, . . . , n; consequently $p_r$ is also restricted to a fixed number of possible values. The paired values of $p_r$ and r may be plotted on graph paper, $p_r$ on the vertical and r on the horizontal axis. The resulting graph is a visual description of the functional relation between the event r and its relative frequency or probability $p_r$.

In the binomial the variable r is discrete and not continuous. In tossing 50 coins, for example, the number of heads or tails obtained is a discrete number. The value of $p_r$ changes from r to r + 1 by discrete steps. We observe, however, that as n increases in size we obtain a larger and larger number of graduations of the distribution and by increasing the size of n we can make the graduations as fine as we like. By considering the situation where n becomes indefinitely large, that is, n approaches infinity, we arrive at the conception of a continuous frequency curve or function. This curve is the limiting form of the binomial.
Frequency curves are in certain instances conceptualized as extending along the X axis from minus infinity to plus infinity; that is, the curves taper off to zero at the two extremities. Although this is so, the area between the curve and the horizontal axis is always finite. For convenience this area is taken as unity.

Frequently it is necessary to find the proportion of the total area of the curve between ordinates erected at particular values of X, that is, between X = a and X = b as shown in Fig. 6.1. This proportion is the probability that a particular value of X drawn at random from the population which the curve describes falls between a and b. Because of this, frequency curves are often referred to as probability curves or probability distributions. Statisticians use a variety of theoretical frequency curves as models. The normal curve is one of the more important of these.


Fig. 6.1 Frequency curve showing area between X = a and X = b.


6.3 THE NORMAL CURVE


The equation for the normal curve is

$$Y = \frac{N}{\sigma\sqrt{2\pi}}\, e^{-(X-\mu)^2/2\sigma^2} \qquad [6.2]$$

where Y = height of curve for particular values of X
π = a constant = 3.1416
e = base of Napierian logarithms = 2.7183
N = number of cases, which means that the total area under the curve is N
μ and σ = mean and standard deviation of the distribution, respectively

We have used the notation μ and σ in this formula to represent the mean and standard deviation, instead of $\bar{X}$ and s, because the formula is a theoretical model. The equation may be written in standard-score form. Standard scores have a mean of zero and a standard deviation of 1. Thus μ = 0 and σ = 1. The area under the curve is taken as unity; that is, N = 1. With these substitutions we may write

$$y = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2} \qquad [6.3]$$

Here z is a standard score on X and is equal to $(X - \mu)/\sigma$. The score z is a deviation in standard deviation units measured along the base line of the curve from a mean of zero, deviations to the right of the mean being positive and those to the left negative. The curve has unit area and unit standard deviation. By substituting different values of z in the above formula, different values of y may be calculated. When z = 0, $y = 1/\sqrt{2\pi} = .3989$. This follows from the fact that $e^0 = 1$. Any term raised to the zero power is equal to 1. Thus the height of the ordinate at the mean of the normal curve in standard-score form is given by the number .3989. For z = ±1, y = .2420, and for z = ±2, y = .0540. Similarly, the height of the curve may be calculated for any value of z. In practice the student is not required to substitute different values of z in the normal-curve formula and solve for y to obtain the height of the required ordinate. These values may be obtained from Table A of the Appendix. This table shows different values of y corresponding to different values of z. It also shows the area of the curve falling between the ordinates at the mean and different values of z.

Fig. 6.2 Normal curve showing height of the ordinate at different values of x/σ, or z.
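The ordinates quoted above can be obtained by substituting in the normal-curve formula. This Python sketch is illustrative only; `normal_ordinate` and its parameter names are assumptions, with `n_cases` playing the role of N in the general formula.

```python
from math import exp, pi, sqrt


def normal_ordinate(x: float, mu: float = 0.0, sigma: float = 1.0,
                    n_cases: float = 1.0) -> float:
    """Height of the normal curve at x; the defaults give the unit normal curve."""
    z = (x - mu) / sigma  # standard score
    return n_cases / (sigma * sqrt(2 * pi)) * exp(-z * z / 2)


for z in (0, 1, -1, 2, -2):
    print(z, round(normal_ordinate(z), 4))
```

Because z enters only through its square, ordinates at +z and -z are equal, which is the symmetry the text describes.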

The general shape of the normal curve can be observed by inspection of Fig. 6.2. The curve is symmetrical. It is asymptotic at the extremities; that is, it approaches but never reaches the horizontal axis. It can be said to extend from minus infinity to plus infinity. The area under the curve is finite.


AREAS UNDER THE NORMAL CURVE 

For many purposes it is necessary to ascertain the proportion of the area 
under the normal curve between ordinates at different points on the base 
line. We may wish to know (1) the proportion of the area under the curve 
between an ordinate at the mean and an ordinate at any specified point 
either above or below the mean, (2) the proportion of the total area above or 
below an ordinate at any point on the base line, (3) the proportion of the 
area falling between ordinates at any two points on the base line. 

Table A of the Appendix shows the proportion of the area between the mean of the unit normal curve and ordinates extending from z = 0 to z = 3. Let us suppose that we wish to find the area under the curve between the ordinates at z = 0 and z = +1. We note from Table A that this area is .3413 of the total area. Because the curve is symmetrical, the proportion of the area between z = 0 and z = -1 is also .3413, and the proportion between z = ±1 is .3413 + .3413 = .6826. Similarly, the proportion of the area between z = ±2 is .4772 + .4772 = .9544, and the proportion between z = ±3 is .49865 + .49865 = .9973.

Fig. 6.3 Normal curve showing areas between ordinates at different values of x/σ, or z.

Consider the problem of finding the proportion of the area above or below any point on the base line of the curve. For illustration, let the point be z = 1. The proportion of the area between the mean and z = 1 is .3413. The proportion of the area below the mean is .5000. The proportion of the total area below z = 1 is .5000 + .3413 = .8413. The proportion above this point is 1.0000 - .8413 = .1587. Similarly, the proportion of the area above or below any point on the base line can be readily ascertained.

Consider the problem of finding the area between ordinates at any two points on the base line. Let us assume that we require the area between z = .5 and z = 1.5. From Table A of the Appendix we note that the proportion of the area between the mean and z = .50 is .1915. We note also that the area between the mean and z = 1.5 is .4332. The area between z = .5 and z = 1.5 is obtained by subtracting one area from the other and is .4332 - .1915 = .2417. The area for any other segment of the curve may be similarly obtained.

Fig. 6.4 Normal curve showing area between ordinates at z = .50 and z = 1.50.

On occasion we wish to find values of z which include some specified proportion of the total area. For example, the values of z above and below the mean which include a proportion .95 of the area may be required. We select a value of z above the mean which includes a proportion .475 of the total area and a value of z below the mean which includes a proportion .475 of the total area. From Table A of the Appendix we observe that the proportion .475 of the area falls between z = 0 and z = 1.96. Since the curve is symmetrical the proportion .475 of the area falls between z = 0 and z = -1.96. Thus a proportion .95, or 95 per cent, of the total area falls within the limits z = ±1.96. Also a proportion .05, or 5 per cent, falls outside these limits. Similarly, it may be shown that 99 per cent of the area of the curve falls within, and 1 per cent outside, the limits z = ±2.58. Figure 6.5 is a normal curve showing values of z which include a proportion .95 of the total area.

Fig. 6.5 Normal curve showing values of z which include an area which is .95 of the total area.
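Areas of the kind read from Table A can also be computed from the error function. The sketch below is an assumption-laden stand-in for the table, not the book's method; `area_below` and `area_between` are illustrative names, and `math.erf` is the standard-library error function.

```python
from math import erf, sqrt


def area_below(z: float) -> float:
    """Proportion of the unit normal curve lying below the ordinate at z."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def area_between(z_low: float, z_high: float) -> float:
    """Proportion of the area between ordinates at z_low and z_high."""
    return area_below(z_high) - area_below(z_low)


print(round(area_between(0, 1), 4))         # mean to z = 1
print(round(area_below(1), 4))              # below z = 1
print(round(1 - area_below(1), 4))          # above z = 1
print(round(area_between(0.5, 1.5), 4))     # between z = .5 and z = 1.5
print(round(area_between(-1.96, 1.96), 4))  # within z = +/-1.96
print(round(area_between(-2.58, 2.58), 4))  # within z = +/-2.58
```

Values computed this way can differ in the last digit from sums of rounded table entries, since the table entries are themselves rounded before being added.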
К, 


92 


6.6 


BASIC STATISTICS 


AREAS UNDER THE NORMAL CURVE— 
ILLUSTRATIVE EXAMPLE 


formed to 15 x (—.675) + 100 = 89.88 


ution is continuous, and 
he exact limits 6.5 to 7.5- 


е value 6.5 is equivalent 


he proportion of the area of the normal 


more heads in tossin 
Е де й © may compare this with the 
exact probabilities obtained directly from the binomial ex е this ^n 
pansion shown 


Table 6.1 Comparison of binomial probabilities with corresponding normal approximations for n = 10 and p = 1/2

No. of     Exact binomial     Normal
heads      probability        approximation

10             .001              .002
 9             .010              .011
 8             .044              .044
 7             .117              .114
 6             .205              .205
 5             .246              .248
 4             .205              .205
 3             .117              .114
 2             .044              .044
 1             .010              .011
 0             .001              .002

Total         1.000             1.000


in Table 6.1. This probability is .172. Here we note that the discrepancy between the estimate obtained from the normal curve and the exact binomial probability is trivial.

Table 6.1 compares the binomial and normal probabilities for n = 10 and p = 1/2. We note that in this instance the differences between the exact binomial probabilities and the corresponding normal approximations are small. The accuracy of the approximation depends both on n and p; as n increases in size the accuracy of the approximation is improved. For any n, as p departs from 1/2 the approximation becomes less accurate.

6.7 SUMMARY OF PROPERTIES OF THE NORMAL CURVE


The following is a summary of properties of the normal curve.

1 The curve is symmetrical. The mean, median, and mode coincide.

2 The maximum ordinate of the curve occurs at the mean, that is, where z = 0, and in the unit normal curve is equal to .3989.

3 The curve is asymptotic. It approaches but does not meet the horizontal axis and extends from minus infinity to plus infinity.

4 The points of inflection of the curve occur at points ±1 standard deviation unit above and below the mean. Thus the curve changes from convex to concave in relation to the horizontal axis at these points.

5 Roughly 68 per cent of the area of the curve falls within the limits ±1 standard deviation unit from the mean.

6 In the unit normal curve the limits z = ±1.96 include 95 per cent and the limits z = ±2.58 include 99 per cent of the total area of the curve, 5 per cent and 1 per cent of the area, respectively, falling beyond these limits.


EXERCISES


1 Find the height of the ordinate of the normal curve at the following z values: -2.15, -1.53, +.07, +.99, +2.76.

2 Consider a normally distributed variable with $\bar{X}$ = 50 and s = 10. For N = 200 find the height of the ordinates at the following values of X: 25, 35, 49, 57, and 63.

People with IQ's (a) above 135, (b) abov. 
75 and 125, 


use a letter-grade system A, B, C, D, and E with the proportions .10, .20, .40, .20, and .10 in the five grades, respectively. Find the score intervals for the five letter grades.

A > Р Бы) à 
9 The following are data for test scores for two age groups:

11-year group    14-year group

Assuming normality, estimate how many of the 11-year-olds do better than the average 14-year-old and how many of the 14-year-olds do worse than the average 11-year-old.


7


CORRELATION


7.1 INTRODUCTION


paired with another observation for vw 
member of the group. The study of this type of data has two closely relate 
aspects, correlation and Prediction, Correlation 


Mediocrity in Hereditary Stature. Galton was interested in predicting the physical characteristics of offspring from a knowledge of the physical characteristics of their parents. He observed, for example, that the offspring of tall parents tended on the average to be shorter than their parents, whereas the offspring of short parents tended on the average to be taller than their parents. He used the word "regression" to refer to this effect. In modern statistics the term regression no longer has the biological implication assigned to it by Galton. In general, regression has to do with the prediction of one variable from a knowledge of another. Karl Pearson extended Galton's ideas of regression and developed the methods of correlation extensively used today.

The most widely used measure of correlation is the Pearson product-moment correlation coefficient. This measure is used where the variables are quantitative, that is, of the interval or ratio type. Other varieties of correlation have been developed for use with nominal and ordinal variables. One measure commonly used to describe the relationship between two nominal variables is the contingency coefficient. Methods used with ordinal variables are called rank-order correlation methods. These special types of correlation will be discussed in later chapters.

In this chapter we shall present a discussion of correlation and proceed in Chap. 8 to a discussion of prediction and its relation to correlation. The reader will bear in mind that correlation and prediction are two closely related topics. Certain topics pertaining to the interpretation of the correlation coefficient and assumptions underlying its use can only be discussed following a consideration of prediction.

7.2 RELATION BETWEEN PAIRED OBSERVATIONS

Consider a group comprised of N members. Denote these by $A_1, A_2, \ldots, A_N$. Measurements are available on each member on two variables, X and Y. The data may be represented symbolically as follows:

              Measurement
Members       X        Y

$A_1$         $X_1$    $Y_1$
$A_2$         $X_2$    $Y_2$
$A_3$         $X_3$    $Y_3$
. . .         . . .    . . .
$A_N$         $X_N$    $Y_N$

Let us assume that measurements have been arranged in order of magnitude on X extending from $X_1$, the highest, to $X_N$, the lowest, measurement. Given this arrangement on X, we may consider the possible arrangements of Y with respect to X. Consider an arrangement where the values of Y are

» and so on, until the member who is lowest 
presents the maximum positive 
der now an arrangement where 
he lowest and Y, is the highest. 
ston Y, the member who is next 


on X is highest on Y. This Situation represer 
tion between the variables. Consider a situ 
Y is strictly random in relation to X, Value. 


of independence. The two sets 
each other. Under this arrange 


Үз ing 
between Х and У. Between the two extreme arrangements, representing 


the maximum Positive and negative relation, we тау consider arrange- 
ments which represent varying degrees of rela 


tion in either a positive or 
negative direction. To illustrate, 


let us assume that the values of X for the 
^ and A; are the integers 5, 4, 3, 2, and 1. If the 
е integers and are also arranged in the order 5, 4, 3, 
ximum positive relation, If values of Y are arranged 
2, and 1, we have clearly a high positive relation, 
St possible, If values of Yare arranged in the order 1, 
а maximum negative relation. Again an arrangement 
‚ ® 4, 3, 5 would be high negative, although not the 

Relations of the kind described above may be examined by plotting the paired measurements on graph paper, each pair of observations being represented by a point. Such a plotting of measurements is sometimes called a scatter diagram. Inspection of a scatter diagram yields an intuitive appreciation of the degree of relation between the two variables. Figure 7.1 shows four such diagrams.

Figure 7.1a is a graphi 
that the points fall very с 


е and a perf, 


nitely large number of Possible arrang, 


Fig. 7.1 (a) High positive correlation. (b) Low positive correlation. (c) Zero correlation. (d) Negative correlation.


senting an indefinitely large number of possible relations between the two variables.


7.3 THE CORRELATION COEFFICIENT

Measures of correlation are conventionally defined to take values ranging from -1 to +1. A value of -1 describes a perfect negative relation. All points lie on a straight line, and X decreases as Y increases. A value of +1 describes a perfect positive relation. All points lie on a straight line, and X increases as Y increases. A value of 0 describes the absence of a relation. The variable X is independent of Y or bears a random relation to Y. Measures of correlation take positive values where the relation is positive and negative values where the relation is negative.

The most commonly used measure of correlation is the Pearson product-moment correlation coefficient. Many forms of correlation are particular cases of this coefficient. Let X and Y be two sets of paired observations with standard deviations $s_x$ and $s_y$. We may represent the paired observations in standard-score form by taking deviations from the mean and dividing by the standard deviation. Thus

$$z_x = \frac{X - \bar{X}}{s_x}$$
$$z_y = \frac{Y - \bar{Y}}{s_y}$$

100 


[7.1] 


BASIC STATISTICS 


The standard scores have a mean of zero and a standard deviation of unity. 
The product-moment correlation coefficient, denoted by the letter г, is the 


sum of products of standard scores, divided Бу N — 1. The formula for гїп 
standard-score form is 


Ep 
У 


Thus the correlation coefficient m 
variables to standard-score form, 
N= 1. 


ay be obtained by converting the two 
summing their product, and dividing by 
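The standard-score definition can be illustrated with a short Python sketch. It is not part of the original text: the function names `standard_scores` and `pearson_r_standard` are illustrative choices, and the data are the ten paired observations that appear in Table 7.1.

```python
import math

# Paired observations from Table 7.1 of the text.
X = [5, 10, 5, 11, 12, 4, 3, 2, 7, 1]
Y = [1, 6, 2, 8, 5, 1, 4, 6, 5, 2]


def standard_scores(values):
    """Return z scores, using N - 1 in the standard deviation as the text does."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def pearson_r_standard(xs, ys):
    """Sum of products of paired standard scores, divided by N - 1."""
    zx, zy = standard_scores(xs), standard_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)


r = pearson_r_standard(X, Y)
print(round(r, 2))  # 0.58, the value reported in the text
```

The sketch divides by N - 1 both in the standard deviation and in the final average of products, so the two conventions stay consistent with each other.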


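The above coefficient is so defined that it ranges from -1 to +1. The sum of
products $\sum z_x z_y$ will take its maximum possible value when (1) the
values of $z_x$ and $z_y$ are in the same order and (2) every value of $z_x$
is equal to the value of $z_y$ with which it is paired, the paired standard
scores being identical. Where the paired standard scores are such that
$z_x = z_y$, we may write $z_x z_y = z_x^2 = z_y^2$ and
$\sum z_x z_y = \sum z_x^2 = \sum z_y^2$. The quantity
$\sum z_x^2 = \sum z_y^2 = N - 1$. Thus the maximum possible value of
$\sum z_x z_y$ is N - 1. The sum $\sum z_x z_y$ will take its minimum possible
value when (1) the values of $z_x$ and $z_y$ are in inverse order and (2) every
$z_x$ has the same numerical value as the $z_y$ with which it is paired, but
differs in sign. This minimum value of $\sum z_x z_y$ is readily shown to be
-(N - 1). In both cases all points will fall exactly along a straight line.
When $z_x$ and $z_y$ bear no systematic relation to each other, the expected
value of $\sum z_x z_y$ will be zero. We may define r as the ratio of the
observed value of $\sum z_x z_y$ to its maximum possible quantity; that is, r
is defined as $\sum z_x z_y/(N - 1)$. A maximum value of +1 will occur only
when the paired observations have the same standard scores. A minimum value
of -1 will occur only when every value of $z_x$ is equal to $z_y$ in absolute
value but differs in sign. When the data do not satisfy these conditions, r
will be less than +1 and greater than -1.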
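The two limiting cases can be checked numerically. The following is a minimal sketch under assumed names (`standard_scores`, `r_from_z`), using the integers 5 to 1 from the earlier illustration; it is an illustration rather than anything given in the text.

```python
import math


def standard_scores(values):
    """z scores with N - 1 in the standard deviation, as in the text."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def r_from_z(zx, zy):
    """Formula [7.1]: sum of products of standard scores over N - 1."""
    return sum(a * b for a, b in zip(zx, zy)) / (len(zx) - 1)


X = [5, 4, 3, 2, 1]
zx = standard_scores(X)

sum_of_squares = sum(z * z for z in zx)      # equals N - 1
r_identical = r_from_z(zx, zx)               # paired z scores identical
r_opposite = r_from_z(zx, [-z for z in zx])  # same size, opposite sign

print(sum_of_squares, r_identical, r_opposite)
```

Up to floating-point rounding, the three printed quantities are N - 1 = 4, +1, and -1.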


7.4 CALCULATION OF THE CORRELATION COEFFICIENT
FROM UNGROUPED DATA

The formula for the correlation coefficient in standard-score form is
$r = \sum z_x z_y/(N - 1)$. The calculation of a correlation coefficient using
this formula is somewhat laborious, as it requires the conversion of all
values to standard scores. Since $z_x = (X - \bar{X})/s_x$ and
$z_y = (Y - \bar{Y})/s_y$, by substitution we may write the formula for the
correlation coefficient in deviation-score form. Thus

$$ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}} = \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}} \qquad [7.2] $$

where x and y are deviations from the means $\bar{X}$ and $\bar{Y}$,
respectively.

Table 7.1 Calculation of the correlation coefficient from ungrouped
data using deviation scores

   1     2     3     4     5     6     7
   X     Y     x     y    x²    y²    xy

   5     1    -1    -3     1     9    +3
  10     6    +4    +2    16     4    +8
   5     2    -1    -2     1     4    +2
  11     8    +5    +4    25    16   +20
  12     5    +6    +1    36     1    +6
   4     1    -2    -3     4     9    +6
   3     4    -3     0     9     0     0
   2     6    -4    +2    16     4    -8
   7     5    +1    +1     1     1    +1
   1     2    -5    -2    25     4   +10

Sum 60  40     0     0   134    52   +48

$$ r = \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}} = \frac{48}{\sqrt{134 \times 52}} = +.58 $$


The above formula for the correlation coefficient may be used for
computational purposes. The calculation is illustrated in Table 7.1. The
first two columns contain the paired observations on X and Y. These columns
are summed and divided by N to obtain the means $\bar{X}$ and $\bar{Y}$.
Column 3 contains the deviations from the mean of X, and col. 4 the deviations
from the mean of Y. Columns 5 and 6 contain the squares of these deviations.
These columns are summed to obtain $\sum x^2$ and $\sum y^2$. Column 7 contains
the products of x and y, and this column is summed to obtain $\sum xy$. The
correlation coefficient in this example is +.58.

For certain purposes it is desirable to express the formula for the
correlation in terms of the raw scores or the original observations. This
formula is as follows:

$$ r = \frac{N\sum XY - \sum X \sum Y}{\sqrt{[N\sum X^2 - (\sum X)^2][N\sum Y^2 - (\sum Y)^2]}} \qquad [7.3] $$


This is one of the more convenient formulas to use where a calculating
machine is available. Some modern calculating machines are so designed
that pairs of observations may be entered successively on the machine and
the terms $\sum XY$, $\sum X^2$, $\sum Y^2$, $\sum X$, and $\sum Y$ obtained in
a single operation. Where a calculating machine is not available, this formula
usually involves rather large and unwieldy numbers and the formula in
deviation form may be preferred.


The application of the formula for computing the correlation coefficient
from raw scores is illustrated in Table 7.2. The first two columns contain
the paired observations on X and Y. These columns are summed to obtain
$\sum X$ and $\sum Y$. Columns 3 and 4 contain the squares of the
observations, and these are summed to obtain $\sum X^2$ and $\sum Y^2$.
Column 5 contains the product terms XY, and the sum of this column is
$\sum XY$. The correlation is +.58, which checks with the value obtained by
the previous method using deviation scores.

Table 7.2 Calculation of the correlation coefficient from ungrouped data
using raw scores

   1     2     3     4     5
   X     Y    X²    Y²    XY

   5     1    25     1     5
  10     6   100    36    60
   5     2    25     4    10
  11     8   121    64    88
  12     5   144    25    60
   4     1    16     1     4
   3     4     9    16    12
   2     6     4    36    12
   7     5    49    25    35
   1     2     1     4     2

  60    40   494   212   288
  ΣX    ΣY   ΣX²   ΣY²   ΣXY

$$ r = \frac{N\sum XY - \sum X \sum Y}{\sqrt{[N\sum X^2 - (\sum X)^2][N\sum Y^2 - (\sum Y)^2]}} = \frac{10 \times 288 - 60 \times 40}{\sqrt{(10 \times 494 - 60^2)(10 \times 212 - 40^2)}} = \frac{480}{\sqrt{1{,}340 \times 520}} = +.58 $$
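The agreement between the deviation-score formula [7.2] and the raw-score formula [7.3] can be checked with a short sketch. The function names `r_deviation` and `r_raw` are illustrative assumptions; the data are those of Tables 7.1 and 7.2.

```python
import math

# Paired observations from Tables 7.1 and 7.2 of the text.
X = [5, 10, 5, 11, 12, 4, 3, 2, 7, 1]
Y = [1, 6, 2, 8, 5, 1, 4, 6, 5, 2]


def r_deviation(xs, ys):
    """Formula [7.2]: sum of xy over the root of (sum of x^2)(sum of y^2)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    return sum_xy / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def r_raw(xs, ys):
    """Formula [7.3]: the same coefficient computed from raw-score sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


print(round(r_deviation(X, Y), 2), round(r_raw(X, Y), 2))  # 0.58 0.58
```

The raw-score route needs only five running sums, which is why the text recommends it where a calculating machine is available.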


7.5 BIVARIATE FREQUENCY DISTRIBUTIONS

In Chap. 2 we discussed the construction of frequency distributions for a
single variable. A frequency distribution was defined as an arrangement of
the data showing the frequency of occurrence of the observations within
defined ranges of the values of the variable, the defined ranges being the
class intervals. Where one variable only is involved, the distribution may
be spoken of as univariate. The frequency-distribution idea may be readily
extended to two-variable situations. A frequency distribution involving two
variables is known as a bivariate frequency distribution.

A bivariate frequency distribution is a table comprised of a number of
rows and columns. The columns correspond to class intervals of the X
variable and the rows to class intervals of the Y variable. Each pair of
observations is entered as a tally in its appropriate cell. To illustrate,
Table 7.3 shows a bivariate frequency distribution for a set of paired
observations, these being scores on two forms of a French reading test. In
constructing such a distribution a person who makes a score of 27 on Form A
and a score of 31 on Form B is entered as a tally in the cell that is common
to the row corresponding to the class interval 25 to 29 on Form A and the
column corresponding to the class interval 30 to 34 on Form B. Similarly,
every pair of observations is entered as a tally in its appropriate cell. The
tallies in each cell are then counted, and their number recorded. These
numbers are the bivariate frequencies. By summing the bivariate frequencies
in the rows we obtain, as shown in Table 7.3, the frequency distribution for
the Y variable, and by summing the columns we obtain the frequency
distribution for the X variable. The separate frequency distributions of X
and Y are usually written at the bottom and to the right of the table. In the
selection of class intervals for X and Y the usual conventions regarding
class intervals apply.

Methods exist for the calculation of correlation coefficients from
bivariate frequency distributions. These methods are now infrequently
used and will not be described here. The idea of a bivariate frequency table
is, however, of some importance.
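The tallying procedure described above can be sketched as follows. This is an illustration only: the score pairs are hypothetical (apart from the pair 27 on Form A and 31 on Form B mentioned in the text), and the names `interval` and `bivariate_frequencies` are assumptions, not taken from the source.

```python
from collections import Counter

WIDTH = 5  # class-interval width, as in Table 7.3 (0-4, 5-9, ...)


def interval(score, width=WIDTH):
    """Return the (lower, upper) limits of the class interval holding score."""
    lower = (score // width) * width
    return (lower, lower + width - 1)


def bivariate_frequencies(pairs):
    """Tally (x, y) pairs into cells keyed by (row interval, column interval).

    Rows are class intervals of Y and columns are class intervals of X.
    """
    return Counter((interval(y), interval(x)) for x, y in pairs)


# Hypothetical (Form B, Form A) score pairs; X is Form B and Y is Form A.
pairs = [(31, 27), (33, 28), (12, 14), (8, 14), (22, 27), (31, 36)]
cells = bivariate_frequencies(pairs)

# Marginal distributions: sum the cells across each row and each column.
row_totals, col_totals = Counter(), Counter()
for (row, col), frequency in cells.items():
    row_totals[row] += frequency
    col_totals[col] += frequency

print(cells[((25, 29), (30, 34))])  # tallies in row 25-29, column 30-34
```

Keying each cell by its pair of class intervals keeps the row and column sums a simple matter of adding over one coordinate.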


Table 7.3* Bivariate frequency distribution for two forms of French reading
test. [Columns: X, Form B, class intervals 0-4 to 35-39; rows: Y, Form A,
class intervals 0-4 to 35-39; marginal frequencies $f_x$ and $f_y$.]

* Table 7.3 is reproduced from R. W. B. Jackson and George A. Ferguson,
Manual of educational statistics, University of Toronto, Department of
Educational Research, Toronto, 1942.

7.6 THE VARIANCE OF SUMS AND DIFFERENCES

Let X and Y be two sets of measurements for the same group of individuals.
These, for example, may be marks on mathematics and history examinations
for a group of university students. What is the variance of X + Y? If
mathematics and history marks are added together, what is the variance of
the sums?

The sum of X and Y is X + Y. The mean of the sum of X and Y is
$\bar{X} + \bar{Y}$, or the sum of the two means. We may then write the
variance of sums as follows:

$$
\begin{aligned}
s_{x+y}^2 &= \frac{\sum [(X + Y) - (\bar{X} + \bar{Y})]^2}{N - 1} \\
&= \frac{\sum [(X - \bar{X}) + (Y - \bar{Y})]^2}{N - 1} \\
&= \frac{\sum (X - \bar{X})^2}{N - 1} + \frac{\sum (Y - \bar{Y})^2}{N - 1} + \frac{2\sum (X - \bar{X})(Y - \bar{Y})}{N - 1} \\
&= s_x^2 + s_y^2 + 2rs_xs_y \qquad [7.4]
\end{aligned}
$$

The variance of the sum of X and Y is the sum of the two variances plus
$2rs_xs_y$. If the correlation between the two variables is zero, then
$2rs_xs_y = 0$ and the variance of sums is simply the sum of the two
variances. Terms of the kind $rs_xs_y$ are sometimes called covariance terms,
or covariances.

Similarly, the variance of the differences between X and Y, the variance
of X - Y, is readily shown to be

$$ s_{x-y}^2 = s_x^2 + s_y^2 - 2rs_xs_y \qquad [7.5] $$

The variance of differences is the sum of the two variances minus the
covariance term $2rs_xs_y$.
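Both results can be verified numerically. The sketch below is an illustration, not part of the text: it reuses the paired observations of Table 7.1 as stand-ins for two sets of marks, and the helper names `variance` and `correlation` are assumptions.

```python
import math

# Paired observations from Table 7.1, used here as illustrative marks.
X = [5, 10, 5, 11, 12, 4, 3, 2, 7, 1]
Y = [1, 6, 2, 8, 5, 1, 4, 6, 5, 2]


def variance(values):
    """Variance with N - 1 in the denominator, as defined in the text."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def correlation(xs, ys):
    """Product-moment correlation in deviation-score form."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sum_xy / ((n - 1) * math.sqrt(variance(xs) * variance(ys)))


s_x, s_y = math.sqrt(variance(X)), math.sqrt(variance(Y))
covariance_term = 2 * correlation(X, Y) * s_x * s_y

# Formulas for the variance of sums and of differences.
var_sum_formula = s_x ** 2 + s_y ** 2 + covariance_term
var_diff_formula = s_x ** 2 + s_y ** 2 - covariance_term

# Direct computation on the summed and differenced scores.
var_sum_direct = variance([x + y for x, y in zip(X, Y)])
var_diff_direct = variance([x - y for x, y in zip(X, Y)])

print(var_sum_formula, var_sum_direct)
print(var_diff_formula, var_diff_direct)
```

Each printed pair should agree up to floating-point rounding.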
The sum of more than two variables may be usefully considered. Consider
three variables $X_1$, $X_2$, and $X_3$. The variances and covariances may
for convenience be written in the form of a covariance table, or matrix, as
follows:

$$
\begin{array}{c|ccc}
 & 1 & 2 & 3 \\ \hline
1 & s_1^2 & r_{12}s_1s_2 & r_{13}s_1s_3 \\
2 & r_{12}s_1s_2 & s_2^2 & r_{23}s_2s_3 \\
3 & r_{13}s_1s_3 & r_{23}s_2s_3 & s_3^2
\end{array}
$$

The variances appear along the main diagonal and the covariances on
either side of the main diagonal. The variance of sums is simply the sum of
all the elements in the covariance table or matrix. Thus, for three variables
the variance of sums is:

$$ s_{1+2+3}^2 = s_1^2 + s_2^2 + s_3^2 + 2r_{12}s_1s_2 + 2r_{13}s_1s_3 + 2r_{23}s_2s_3 $$

The argument may be generalized to any number of variables. The
variance of the sum of k variables is very simply the sum of all the elements
of the covariance table or matrix with k rows and columns.
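The statement that the variance of a sum equals the sum of all elements of the covariance matrix can be illustrated with a small sketch. The third variable below is hypothetical data invented for the example, and the helper name `covariance` is an assumption.

```python
def covariance(a, b):
    """Covariance with N - 1 in the denominator; covariance(a, a) is the variance."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (n - 1)


X1 = [5, 10, 5, 11, 12, 4, 3, 2, 7, 1]   # X of Table 7.1
X2 = [1, 6, 2, 8, 5, 1, 4, 6, 5, 2]      # Y of Table 7.1
X3 = [3, 7, 4, 9, 6, 2, 5, 8, 6, 1]      # hypothetical third variable
variables = [X1, X2, X3]

# k-by-k covariance table: variances on the diagonal, covariances elsewhere.
matrix = [[covariance(a, b) for b in variables] for a in variables]
sum_of_elements = sum(sum(row) for row in matrix)

# Variance of the composite score X1 + X2 + X3, computed directly.
composite = [sum(scores) for scores in zip(*variables)]
variance_of_sum = covariance(composite, composite)

print(sum_of_elements, variance_of_sum)
```

Because each covariance appears twice in the table, once on each side of the main diagonal, summing every element reproduces the factor of 2 in the written-out formula.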


EXERCISES

1 Would you expect the correlation between the following to be positive,
negative, or about zero? (a) The intelligence of parents and their
offspring, (b) scholastic success and annual income 10 years after
graduation, (c) age and mental ability, (d) marks on examinations in
physics and mathematics, (e) wages and the cost of living, (f) birth
rate and the numerosity of storks, (g) scores on a dominance-submission
test for husbands and scores for their wives.


The following are paired measurements: 


8 9 1 6 1 


x 8 
ү 3 Т S 8 5 9 
Compute the correlation between X and Y. 


22 
15 


10 


Show that 


Улу ____ Улу _ 
СА Da Very 


When N = 2, what are the possible values of the correlation coeffi- 
cient? 

The correlation coefficient is not necessarily equal to +1 when the 
paired measurements are in exactly the same rank order. Discuss. 


Calculate the correlation coefficient for the following data using
formula [7.3].

Show that $s_{x-y}^2 = s_x^2 + s_y^2 - 2rs_xs_y$.

Under what conditions will the variance of the sum of two variables
equal the variance of the difference between two variables?

Is the correlation between X and Y changed by adding a constant to X
or by multiplying X by a constant?

The formulas in this chapter assume that the variance is defined as
$s^2 = \sum (X - \bar{X})^2/(N - 1)$. What is the formula for r in
standard-score form if the variance is defined as
$s^2 = \sum (X - \bar{X})^2/N$?


PREDICTION IN RELATION
TO CORRELATION

8.1 INTRODUCTION

Psychologists and educationists are frequently concerned with problems of
prediction. The educational psychologist is interested in predicting the
scholastic performance of a child from a knowledge of [...] scores. The
industrial psychologist in the selection of an individual for a particular
type of employment makes a prediction about the job performance of that
individual from information available at the time of selection. The clinical
psychologist may direct his attention to predicting the patient's receptivity
to treatment from information obtained prior to treatment. In many areas of
human endeavor predictions about the subsequent behavior of individuals are
required. A somewhat elaborate statistical technology has evolved for dealing
with the prediction problem. In this chapter we shall restrict attention to
the simplest aspect of prediction, the prediction of one variable from a
knowledge of another.

Prediction and correlation are closely related topics, and an understanding
of one requires an understanding of the other. The presence of a zero
correlation between two variables X and Y may usually be interpreted to mean
that they bear no systematic relation to each other. A knowledge of X tells
us nothing about Y, and a knowledge of Y tells us nothing about X. In
predicting X from Y or Y from X no prediction better than a random guess is
possible. The presence of a nonzero correlation between X and Y implies that
if we know something about X we know something about Y, and vice versa. If
knowing X implies some knowledge of Y, a prediction about Y is possible which
is better than a random guess about Y made in the absence of a knowledge of
X. The greater the absolute value of the correlation between X and Y, the
more accurate the prediction of one variable from the other. If the
correlation between X and Y is either -1 or +1, perfect prediction is
possible.


8.2 THE LINEAR REGRESSION OF Y ON X 


Any set of paired observations may be plotted on graph paper, each pair of 
observations being represented by a point. Consider the data shown in 
Table 8.1, cols. 2 and 3. These columns contain intelligence quotients and 
reading-test scores for a group of 18 school children. These data are plotted 
in graphical form in Fig. 8.1. While the arrangement of points when plotted 
graphically shows considerable irregularity, we observe a tendency for 
reading-test scores to increase as intelligence quotients increase. 


Table 8.1 Calculations for regression line of Y on X for ungrouped data*

    1        2         3          4          5            6
  Pupil               Reading                         Estimated reading
   no.     IQ X       score Y     X²         XY           score Y'

    1       118         66      13,924      7,788           68
    2        99         50       9,801      4,950           55
    3       118         73      13,924      8,614           68
    4       121         69      14,641      8,349           70
    5       123         72      15,129      8,856           71
    6        98         54       9,604      5,292           54
    7       131         74      17,161      9,694           77
    8       121         70      14,641      8,470           70
    9       108         65      11,664      7,020           61
   10       111         62      12,321      6,882           63
   11       118         65      13,924      7,670           68
   12       112         63      12,544      7,056           64
   13       113         67      12,769      7,571           65
   14       111         59      12,321      6,549           63
   15       106         60      11,236      6,360           60
   16       102         59      10,404      6,018           57
   17       113         70      12,769      7,910           65
   18       101         57      10,201      5,757           57

  Sum     2,024      1,155     228,978    130,806

* Reproduced from R. W. B. Jackson and George A. Ferguson, Manual of
educational statistics, University of Toronto, Department of Educational
Research, Toronto, 1942.


Fig. 8.1 Scatter diagram for data of Table 8.1. [Horizontal axis: X, I.Q.;
vertical axis: Y, reading score.]

Let us suppose that we are given a child's intelligence quotient only and
are required to predict his reading-test score. How shall we proceed?
Clearly, the data show considerable irregularity. An exact correspondence
between the two sets of scores does not exist. In this situation we may
proceed by fitting a straight line to the data. This straight line provides an
average statement about the change in one variable with change in the
other. It describes the trend in the data and is based on all the
observations. If, then, we are given a child's intelligence quotient and are
required to predict his reading-test score, we use the properties of the
line. The method used in fitting a line to a set of points in a situation of
this kind is the method of least squares. If our interest resides in
predicting Y from X, the method of least squares locates the line in a
position such that the sum of squares of distances from the points to the
line taken parallel to the Y axis is a minimum. This line is known as the
regression line of Y on X.

The general equation of any straight line is given by

$$ Y = bX + a \qquad [8.1] $$

The quantity a is a constant. It is the distance on the Y axis from the
origin to the point where the line cuts the Y axis. It is the value of Y
corresponding to X = 0. If we substitute X = 0 in the equation for a straight
line, we observe that Y = a. The quantity b is the slope of the line. The
slope of any line is simply the ratio of the distance in a vertical direction
to the distance in a horizontal direction, as illustrated in Fig. 8.2. The
slope describes the rate of increase in Y with increase in X. If a and b are
known, the location of the line is uniquely fixed and for any given value of
X we can calculate the corresponding value of Y.

Fig. 8.2 The slope of a line.

Where the regression line of Y on X is fitted by the method of least
squares, the slope of the line $b_{yx}$ and the point where the line cuts the
Y axis $a_{yx}$ may be calculated by the following formulas:

$$ b_{yx} = \frac{\sum XY - (\sum X \sum Y/N)}{\sum X^2 - [(\sum X)^2/N]} \qquad [8.2] $$

$$ a_{yx} = \frac{\sum Y - b_{yx}\sum X}{N} \qquad [8.3] $$

The quantity $\sum X$ is the sum of X, $\sum Y$ is the sum of Y, $\sum XY$ is
the sum of products of X and Y, $\sum X^2$ is the sum of squares of X, and N
is the number of cases.

To illustrate, consider the data of Table 8.1. Columns 2 and 3 provide
intelligence quotients and reading scores for the 18 school children.
Column 4 provides the values X², and col. 5 the products XY. Summing the
columns, we obtain

$$ \sum XY = 130{,}806 \qquad \sum X = 2{,}024 \qquad \sum Y = 1{,}155 \qquad \sum X^2 = 228{,}978 \qquad N = 18 $$

Applying formulas [8.2] and [8.3], we have

$$ b_{yx} = \frac{130{,}806 - 2{,}024 \times 1{,}155/18}{228{,}978 - (2{,}024)^2/18} = .6708 $$

$$ a_{yx} = \frac{1{,}155 - .6708 \times 2{,}024}{18} = -11.25 $$

The regression line of Y on X is then described by the equation
Y' = .6708X - 11.25. The symbol Y' has been introduced to refer to the
estimated value of Y, that is, the value of Y estimated from a knowledge of
X. Y' is a distance from the X axis to the line corresponding to any value of
X. By substituting any value of X in the formula we obtain Y', the estimated
value of Y. Column 6 of Table 8.1 shows the estimated reading-test scores
obtained by applying this regression equation.
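The calculation can be retraced with a brief sketch. It is an illustration under assumed names (`regression_y_on_x`, `estimates`), using the data of Table 8.1. Carried out without intermediate rounding, the intercept comes to roughly -11.26 rather than the -11.25 printed above; the small difference presumably reflects rounding in the hand calculation. Column 6 of the table is reproduced when the text's rounded coefficients are used.

```python
# IQ (X) and reading score (Y) for the 18 pupils of Table 8.1.
IQ = [118, 99, 118, 121, 123, 98, 131, 121, 108,
      111, 118, 112, 113, 111, 106, 102, 113, 101]
READING = [66, 50, 73, 69, 72, 54, 74, 70, 65,
           62, 65, 63, 67, 59, 60, 59, 70, 57]


def regression_y_on_x(xs, ys):
    """Slope and intercept of the regression of Y on X (formulas [8.2], [8.3])."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    slope = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    intercept = (sum_y - slope * sum_x) / n
    return slope, intercept


b_yx, a_yx = regression_y_on_x(IQ, READING)
print(round(b_yx, 4))  # 0.6708

# Estimated reading scores using the coefficients as printed in the text.
estimates = [round(0.6708 * x - 11.25) for x in IQ]
print(estimates)
```

Keeping the unrounded slope when computing the intercept is the safer habit in machine computation, since rounding errors in b are multiplied by the sum of X.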


8.3 THE LINEAR REGRESSION OF X ON Y

Above we have considered the regression of Y on X. The regression line has
been located in order to minimize the sum of squares of the distances from
the points to the line taken parallel to the Y axis. Given reading-test scores
and intelligence quotients, we concerned ourselves with predicting
reading-test scores from intelligence quotients. If, however, we wish to
predict intelligence quotients from reading-test scores, a different
regression line is used. This is the regression line of X on Y. This line is
located in a position such as to minimize the sum of squares of the distances
from the points to the line parallel to the X axis. We see, therefore, that
two regression lines may be fitted to any set of paired observations, the
regression line of Y on X and the regression line of X on Y. The regression
of Y on X is used in predicting Y from X. The regression of X on Y is used in
predicting X from Y. These two lines will differ except in the particular
case where all the points fall exactly on a straight line. Under this
circumstance the two regression lines coincide.

The formula for the regression line of X on Y is given by

$$ X' = b_{xy}Y + a_{xy} \qquad [8.4] $$

The symbol X' is used to refer to the predicted value of X, the value
estimated from a knowledge of Y; $b_{xy}$ is the slope of the regression
line; and $a_{xy}$ is the point where the line intercepts the X axis. The
values of $b_{xy}$ and $a_{xy}$ may be calculated from the formulas

$$ b_{xy} = \frac{\sum XY - (\sum X \sum Y/N)}{\sum Y^2 - [(\sum Y)^2/N]} \qquad [8.5] $$

and

$$ a_{xy} = \frac{\sum X - b_{xy}\sum Y}{N} \qquad [8.6] $$
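The text does not work this second regression line out for the data of Table 8.1; the sketch below does so purely as an illustration. The function name `regression_x_on_y` is an assumption, and the printed coefficients are computed by the code rather than quoted from the source.

```python
# IQ (X) and reading score (Y) for the 18 pupils of Table 8.1.
IQ = [118, 99, 118, 121, 123, 98, 131, 121, 108,
      111, 118, 112, 113, 111, 106, 102, 113, 101]
READING = [66, 50, 73, 69, 72, 54, 74, 70, 65,
           62, 65, 63, 67, 59, 60, 59, 70, 57]


def regression_x_on_y(xs, ys):
    """Slope and intercept of the regression of X on Y (formulas [8.5], [8.6])."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_y2 = sum(y * y for y in ys)
    slope = (sum_xy - sum_x * sum_y / n) / (sum_y2 - sum_y ** 2 / n)
    intercept = (sum_x - slope * sum_y) / n
    return slope, intercept


b_xy, a_xy = regression_x_on_y(IQ, READING)
print(b_xy, a_xy)

# Predicted IQ for a pupil with a reading score of 60.
predicted_iq = b_xy * 60 + a_xy
print(predicted_iq)
```

Only the denominator of the slope changes relative to the regression of Y on X: the sum of squares of Y replaces the sum of squares of X.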


8.4 REGRESSION LINES FOR A BIVARIATE
FREQUENCY DISTRIBUTION

Where data are grouped in the form of a bivariate frequency table the
frequencies in each row or each column of the table constitute a frequency
distribution. Table 8.2 is a bivariate frequency distribution for scores on a
verbal intelligence test and the Binet intelligence test. We note, for
example, that 104 individuals make scores between 100 and 109 on the
Binet. The frequencies in the 100-109 column comprise the frequency
distribution of scores on the verbal test for all individuals with IQ's
between 100 and 109 on the Binet. The mean score on the verbal test for these
104 individuals can be readily calculated from the distribution in the
100-109 column. If we know only that an individual's IQ falls between 100 and
109, the best estimate we can make of his verbal-test score is that he is at
the mean of the individuals with IQ's between 100 and 109. The means for all
the column arrays may be calculated. These are the mean verbal-test scores of
individuals falling within particular class intervals on the Binet scale. A
straight line may be fitted to these means by the method of least squares.
This line is the regression of Y on X, the regression line used in
predicting verbal-test scores from Binet IQ's. Similarly, the means for the
row arrays may be calculated and a line fitted to these means by the
method of least squares. This line is the regression of X on Y, the
regression line used in predicting Binet IQ's from verbal-test scores. The two
regression lines are shown in Table 8.2.

Table 8.2* Bivariate frequency distribution of scores on a verbal
intelligence test (Y, verbal score) and the Binet intelligence test (X, Binet
IQ), showing the regression lines of Y on X (line A) and of X on Y (line B).

* Data from A. M. Macmeeken, The intelligence of a representative group of
Scottish children, University of London Press, London. Reproduced with
permission of the Scottish Council for Research in Education, Edinburgh,
Scotland.

8.5 RELATION OF REGRESSION TO CORRELATION

If all points in a scatter diagram fall exactly along a straight line, the two
regression lines coincide. Perfect prediction is possible. The correlation
coefficient in this case is either -1 or +1. Where the correlation departs
from either -1 or +1, the two regression lines have an angular separation.
In general, as the degree of relationship between two variables decreases,
the angular separation between the two regression lines increases. Where
no systematic relationship exists at all, the two variables being
independent, the two regression lines are at right angles to each other.

A simple relationship exists between the correlation coefficient and the
slopes of the two regression lines. The slopes of the regression lines when
expressed in deviation-score form are given by

$$ b_{yx} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{(N - 1)s_x^2} \qquad b_{xy} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{(N - 1)s_y^2} \qquad [8.7] $$

Since $r = \sum (X - \bar{X})(Y - \bar{Y})/(N - 1)s_xs_y$,

$$ b_{yx} = r\frac{s_y}{s_x} \qquad b_{xy} = r\frac{s_x}{s_y} \qquad [8.8] $$

Multiplying these two expressions, we obtain

$$ b_{yx}b_{xy} = r^2 \qquad [8.9] $$

Thus the product of the slopes of the two regression lines is the square of
the correlation coefficient. The geometric mean of the two slopes is the
correlation coefficient.

Because of the above relation between correlation and regression, we
may write equations for the two regression lines, using the correlation
coefficient. The two equations are as follows:

$$ Y' = r\frac{s_y}{s_x}(X - \bar{X}) + \bar{Y} \qquad X' = r\frac{s_x}{s_y}(Y - \bar{Y}) + \bar{X} \qquad [8.10] $$

These are commonly used equations for predicting a raw score on one
variable from a knowledge of a raw score on another.
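These relations between the slopes and the correlation coefficient can be confirmed numerically. The following is a minimal sketch using the data of Table 8.1; the variable names and the helper `predict_y` are illustrative assumptions.

```python
import math

# IQ (X) and reading score (Y) for the 18 pupils of Table 8.1.
IQ = [118, 99, 118, 121, 123, 98, 131, 121, 108,
      111, 118, 112, 113, 111, 106, 102, 113, 101]
READING = [66, 50, 73, 69, 72, 54, 74, 70, 65,
           62, 65, 63, 67, 59, 60, 59, 70, 57]

n = len(IQ)
mean_x, mean_y = sum(IQ) / n, sum(READING) / n

# Sums of squares and products of deviations from the means.
sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(IQ, READING))
sum_x2 = sum((x - mean_x) ** 2 for x in IQ)
sum_y2 = sum((y - mean_y) ** 2 for y in READING)

s_x = math.sqrt(sum_x2 / (n - 1))
s_y = math.sqrt(sum_y2 / (n - 1))
r = sum_xy / ((n - 1) * s_x * s_y)

# Slopes in deviation-score form.
b_yx = sum_xy / ((n - 1) * s_x ** 2)
b_xy = sum_xy / ((n - 1) * s_y ** 2)


def predict_y(x):
    """Regression of Y on X written in terms of r, s_x, s_y and the means."""
    return r * (s_y / s_x) * (x - mean_x) + mean_y


print(b_yx * b_xy, r ** 2)           # product of slopes equals r squared
print(math.sqrt(b_yx * b_xy), r)     # geometric mean of slopes equals r
print(predict_y(118))
```

The geometric-mean check takes the positive square root, so it matches r directly only when the correlation is positive, as it is for these data.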


If measurements are represented in standard-score form, the correlation
coefficient may be written as $r = \sum z_x z_y/(N - 1)$. If the pairs of
standard scores are plotted graphically and two regression lines fitted to
the data, the equation of these lines may readily be shown to be

$$ z'_y = rz_x \qquad z'_x = rz_y \qquad [8.11] $$

where $z'_y$ and $z'_x$ are the predicted or estimated standard scores. Both
regression lines have the same slope, which is equal to the correlation
coefficient. In this case the slope of the regression of Y on X relative to
the X axis is the same as the slope of the regression of X on Y relative to
the Y axis.

8.6 ERRORS OF ESTIMATE

In predicting one variable from a knowledge of another, distances from
either the X or Y axis to the regression line are used as the predicted
values. A difference between an observed value and a predicted value is an
error of estimate. Thus in predicting Y from X, the predicted value Y' is a
distance from the X axis to the regression line, and the difference between
the observed value of Y and the predicted value, or Y - Y', is an error of
estimate. If the pairs of observations, when plotted graphically as points,
all fall exactly along a straight line, all values of Y - Y' = 0 and perfect
prediction is possible. If the points appear to be arranged at random when
plotted graphically, many values of Y - Y' will be large. The more accurate
the predictions possible, the smaller the values of Y - Y' will tend to be.
The variance of the errors of estimate, that is, of Y - Y', is taken as a
measure of the accuracy of estimate and is given by

$$ s_{y \cdot x}^2 = \frac{\sum (Y - Y')^2}{N - 1} \qquad [8.12] $$

The square root of this quantity is the standard error of estimate in
predicting Y from X.

The reader should note that $s_{y \cdot x}^2$, as defined above, is not an
unbiased estimate of $\sigma_{y \cdot x}^2$. The number of degrees of freedom
associated with the sum of squares $\sum (Y - Y')^2$ is not N - 1 but N - 2,
there being N - 2 deviations about the regression line which are free to
vary. Here we have rather arbitrarily defined the standard error of estimate
using N - 1, and not N - 2, to simplify subsequent exposition. For purposes
of descriptive statistics it is algebraically more convenient to use N - 1
and not N - 2. When, however, a problem of estimation is involved, a
definition using N - 2 should be used.

In predicting X from a knowledge of Y, the variance of the errors of
estimate is

$$ s_{x \cdot y}^2 = \frac{\sum (X - X')^2}{N - 1} \qquad [8.13] $$

where X is an observed value and X' is a value of X estimated from a
knowledge of Y. The square root of this quantity is the standard error of
estimate in predicting X from Y.

The standard error of estimate is related to the correlation coefficient by
the simple relation

$$ s_{y \cdot x} = s_y\sqrt{1 - r^2} \qquad [8.14] $$

and similarly

$$ s_{x \cdot y} = s_x\sqrt{1 - r^2} \qquad [8.15] $$

By transposing these formulas we obtain relations as follows:

$$ r^2 = 1 - \frac{s_{y \cdot x}^2}{s_y^2} = 1 - \frac{s_{x \cdot y}^2}{s_x^2} \qquad [8.16] $$

The above constitutes, in effect, an alternative definition of the
correlation coefficient. If all pairs of points when plotted graphically fall
exactly on a straight line, both $s_{y \cdot x}^2 = 0$ and
$s_{x \cdot y}^2 = 0$. In consequence, r will be either +1 or -1, depending on
whether we take the positive or the negative square root. If the points are
arranged at random when plotted graphically, X and Y being independent of
each other, $s_{y \cdot x}^2 = s_y^2$, $s_{x \cdot y}^2 = s_x^2$, and r = 0.
The value of the correlation is seen, therefore, to depend on the ratio of
two variances, $s_{y \cdot x}^2/s_y^2$ or $s_{x \cdot y}^2/s_x^2$. These two
ratios are equal.
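The relation between the standard error of estimate and r can be checked on the data of Table 8.1. The sketch below is an illustration with assumed variable names; it computes the standard error of estimate directly from the errors Y - Y', from r, and with the N - 2 divisor mentioned above.

```python
import math

# IQ (X) and reading score (Y) for the 18 pupils of Table 8.1.
IQ = [118, 99, 118, 121, 123, 98, 131, 121, 108,
      111, 118, 112, 113, 111, 106, 102, 113, 101]
READING = [66, 50, 73, 69, 72, 54, 74, 70, 65,
           62, 65, 63, 67, 59, 60, 59, 70, 57]

n = len(IQ)
mean_x, mean_y = sum(IQ) / n, sum(READING) / n
sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(IQ, READING))
sum_x2 = sum((x - mean_x) ** 2 for x in IQ)
sum_y2 = sum((y - mean_y) ** 2 for y in READING)

# Least-squares regression of Y on X, unrounded.
b_yx = sum_xy / sum_x2
a_yx = mean_y - b_yx * mean_x
r = sum_xy / math.sqrt(sum_x2 * sum_y2)

# Errors of estimate Y - Y' for every pupil.
errors = [y - (b_yx * x + a_yx) for x, y in zip(IQ, READING)]
sum_sq_errors = sum(e * e for e in errors)

s_y = math.sqrt(sum_y2 / (n - 1))
see_direct = math.sqrt(sum_sq_errors / (n - 1))    # descriptive definition
see_from_r = s_y * math.sqrt(1 - r ** 2)           # via the correlation
see_unbiased = math.sqrt(sum_sq_errors / (n - 2))  # N - 2 for estimation

print(see_direct, see_from_r, see_unbiased)
```

The first two printed values should agree up to rounding; the third is slightly larger because of the smaller divisor.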


8.7 THE VARIANCE INTERPRETATION
OF THE CORRELATION COEFFICIENT

A correlation coefficient is not a proportion. A coefficient of .60 does not
represent a degree of relationship twice as great as a coefficient of .30. The
difference between coefficients of .40 and .50 is not equal to the difference
between coefficients of .50 and .60. The question arises as to how
correlation coefficients of different sizes may be interpreted. One of the
more informative ways of interpreting the correlation coefficient is in terms
of variance.

A score on Y may be viewed as comprised of two parts, an estimated
value Y' and an error of estimation (Y - Y'). Hence Y = Y' + (Y - Y').
These two parts are independent of each other; that is, they are
uncorrelated. The variances of the two parts are directly additive, and we
may write

$$ s_y^2 = s_{y'}^2 + s_{y \cdot x}^2 \qquad [8.17] $$

where $s_y^2$ = variance of Y
      $s_{y'}^2$ = variance of values of Y predicted from X, that is, values on
                   regression line
      $s_{y \cdot x}^2$ = variance of errors of estimation

The variance $s_{y \cdot x}^2 = s_y^2(1 - r^2)$. By substitution we obtain

$$ s_y^2 = s_{y'}^2 + s_y^2(1 - r^2) $$

Dividing this equation by $s_y^2$ and writing it explicit for $r^2$, we have

$$ r^2 = \frac{s_{y'}^2}{s_y^2} \qquad [8.18] $$

Similarly, it may be shown that

$$ r^2 = \frac{s_{x'}^2}{s_x^2} \qquad [8.19] $$

These expressions state that r² is the ratio of two variances, the variance of
the predicted values of Y or X divided by the variance of the observed
values of Y or X.
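The additive partition of variance can be illustrated on the data of Table 8.1. This sketch is not from the text; the helper `variance` and the variable names are assumptions made for the example.

```python
import math

# IQ (X) and reading score (Y) for the 18 pupils of Table 8.1.
IQ = [118, 99, 118, 121, 123, 98, 131, 121, 108,
      111, 118, 112, 113, 111, 106, 102, 113, 101]
READING = [66, 50, 73, 69, 72, 54, 74, 70, 65,
           62, 65, 63, 67, 59, 60, 59, 70, 57]


def variance(values):
    """Variance with N - 1 in the denominator, as defined in the text."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


n = len(IQ)
mean_x, mean_y = sum(IQ) / n, sum(READING) / n
sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(IQ, READING))
sum_x2 = sum((x - mean_x) ** 2 for x in IQ)
sum_y2 = sum((y - mean_y) ** 2 for y in READING)

b_yx = sum_xy / sum_x2
a_yx = mean_y - b_yx * mean_x
r = sum_xy / math.sqrt(sum_x2 * sum_y2)

predicted = [b_yx * x + a_yx for x in IQ]               # values Y'
errors = [y - p for y, p in zip(READING, predicted)]    # values Y - Y'

var_y = variance(READING)
var_predicted = variance(predicted)
var_errors = variance(errors)

print(var_y, var_predicted + var_errors)   # the two parts add up to s_y^2
print(r ** 2, var_predicted / var_y)       # r^2 as a ratio of variances
```

The partition holds exactly only for the least-squares line; with rounded coefficients the predicted and error parts are no longer strictly uncorrelated.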

The variance $s_{y'}^2$ is that part of the variance of Y which can be
predicted from, explained by, or attributed to the variance of X. It is a
measure of the amount of information we have about Y from our information
about X. If r = .80, r² = .64, and we can state that 64 per cent of the
variance of the one variable is predictable from the variance of the other
variable. We know 64 per cent of what we would have to know to make a perfect
prediction of the one variable from the other. Thus r² can quite meaningfully
be interpreted as a proportion and r² × 100 as a per cent. In general, in
attempting to conceptualize the degree of relationship represented by a
correlation coefficient it is more meaningful to think in terms of the square
of the correlation coefficient instead of the correlation coefficient itself.


The values of r² × 100 for values of r from .10 to 1.00 are as follows:

     r     r² × 100

    .10        1
    .20        4
    .30        9
    .40       16
    .50       25
    .60       36
    .70       49
    .80       64
    .90       81
   1.00      100

Thus a correlation of .10 represents a 1 per cent association, a correlation
of .50 represents a 25 per cent association, and the like. A correlation of
.7071 is required before we can state that 50 per cent of the variance of the
one variable is predictable from the variance of the other. With a
correlation as high as .90 the unexplained variance is 19 per cent.

The existence of a correlation between two variables is indicative of a 


8.8 


Fig. 8 


PREDICTION IN RELATION TO CORRELATION 117 


functional relationship, but does not necessarily imply a causal rela- 
tionship. Whether a functional relationship can be regarded as a causal | 
relationship is a matter of interpretation. The correlation between the 
intelligence of parents and their offspring has been frequently reported to 
be of the order of .50. This may be interpreted as indicative of a causal rela- 
tionship. Frequently two variables may correlate because both are corre- 
lated with some other variable or variables. For example, given a group of 
children with a substantial range of ages, a correlation may be found 
between a measure of intelligence and a measure of motor ability. Such a 
correlation may come about because the measures of intelligence and 
motor ability are both correlated with age. If the effects of age are removed. 
the correlation may vanish. г 


ASSUMPTIONS UNDERLYING THE 

CORRELATION COEFFICIENT 

In interpreting the correlation coefficient it is assumed that the fitting of
two straight regression lines to the data does not distort or conceal the
functional relation between the two variables. If the relation is curvilinear,
a coefficient of zero may be obtained and yet a close relation may exist
between the two variables. Figure 8.3 shows a curvilinear relation between
X and Y. If X is known, a fairly accurate prediction can be made of Y. If,
however, two straight regression lines are fitted to the data, these lines will
be about at right angles to each other and r will be about zero. If a strictly
random relation exists between X and Y, the correlation will be zero. The
above example demonstrates that the converse does not hold. If the
correlation is zero, it does not necessarily follow that X and Y bear a random
relation to each other. This may mean that the linear-regression model is a
poor fit to the data. In interpreting the correlation coefficient it is ordinarily
assumed that the linear-regression model is a good fit to the data and that a
correlation of zero means a random relation. Consider a situation where
r = .80. This means that 64 per cent of the variance of the one variable is
predictable from the other and the residual 36 per cent is due to other
factors. The assumption is that these factors do not include, at least to any
appreciable extent, a lack of goodness of fit of the linear-regression lines to
the data. If a large proportion of the residual 36 per cent did result because
of nonlinearity, this would affect the interpretation of the data. In
interpreting a correlation coefficient the investigator should satisfy himself
that the linear-regression lines are a good fit to the data. Any gross departure
from linearity can readily be detected by inspection of the bivariate
frequency table. For small values of N, curvilinear relations may be
difficult to detect. In practice, for many of the variables used in psychology
and education the assumption of linearity of regression is in most instances
reasonably well satisfied.

Fig. 8.3 Scatter diagram showing curvilinear relation.
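The point that a zero correlation need not mean a random relation can be checked numerically. The sketch below uses hypothetical data, not the data of Fig. 8.3: Y is taken to be exactly the square of X, so Y is perfectly predictable from X, yet the product-moment correlation is zero. The function name `pearson_r` is an illustrative helper, not a library routine.

```python
from statistics import mean


def pearson_r(xs, ys):
    """Product-moment correlation computed from deviations about the means."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical data: Y is an exact, but curvilinear, function of X.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(f"r = {r:.3f}")  # the linear model fails although Y is fully determined by X
```

An inspection of the plotted points, as recommended in the text, would reveal the curvilinearity at once; the coefficient alone would not.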

In calculating a correlation coefficient it need not be assumed that the 
distributions of the two variables are normal. Correlations can be com- 
puted for rectangular and other types of distributions. If the two variables 
have different shapes, however, this circumstance will impose constraints 
upon the correlation coefficient. If a positively skewed distribution is 
correlated with a negatively skewed distribution, the differences in the 
shapes of the distributions will influence the correlation coefficient. Some 
part of the departure of the correlation coefficient from unity will result 
because of the different shapes of the two distributions. In such a situation 
as this the differences in shapes of the distributions will in effect ensure 
that one or the other or both regression lines are nonlinear. In psychological
research substantial differences in the shapes of the distributions
under study occasionally are found. Under these circumstances it is 
common practice to transform the variables to a binomial or to an approxi- 
mately normal form. Such transformations will frequently tend to eliminate 
curvilinearity of regression. 

Many other circumstances affect the correlation coefficient. Among
these may be mentioned sampling error and errors of measurement. The
effects of these on the correlation coefficient are discussed in later
chapters.


EXERCISES


1 The following are paired measurements:

X  1  5  6  6  2

Y  2  4  5  3  1

Compute (a) the correlation between X and Y, (b) the slope of the
regression line for predicting X from a knowledge of Y, (c) the slope of
the regression line for predicting Y from a knowledge of X, (d) the
regression equation for predicting a standard score on X from a
standard score on Y, (e) the regression equation for predicting a raw score
on X from a raw score on Y, (f) the variance of the errors of estimation
in predicting Y from X.


2 The following are marks on a college entrance examination X and
first-year averages Y for a sample of 20 students.


Compute (a) the correlation between entrance examination marks and
first-year averages, (b) the regression equation for predicting first-year
averages from examination marks, (c) the predicted first-year averages
for the 20 students, (d) the variance of the errors of estimation.


3 Standard scores on variable X for four individuals are -2.0, -1.68, .18,
1.16. The correlation between X and Y is .50. What are the estimated
standard scores on Y? What is the standard error in estimating
standard scores on Y from standard scores on X?

4 From the data $\bar{X} = 40.3$, $\bar{Y} = 12.5$, $s_x = 12.6$, $s_y = 3.6$, and $r_{xy} = .60$,
write the regression equations for predicting Y from X and X from Y.

5 Show that $r = \sqrt{b_{yx} b_{xy}}$.

6 A correlation of .7071 may be interpreted to mean that 50 per cent of
the variance of one variable is predictable from the other variable. Is
this statement correct if the regression lines are not linear?

7 A variance $s_y^2 = 400$ and the correlation between X and Y is .50. What
is the variance of the errors of estimation in predicting Y from X? What
is the variance of the predicted values?

8 What correlation between X and Y is required in order to assert that 75
per cent of the variance of X depends on the variance of Y?

9 In predicting Y from a knowledge of X, the standard error of estimate is
5 and the mean of the errors of estimation is zero. Assuming the errors
to be normally distributed, indicate the limits above and below the
mean that include 95 per cent of the errors of estimation.




On occasion, samples are drawn from lists which do not provide a complete 
record of all members of the population, but are viewed, perhaps erron- 
eously, as representative of the population. A telephone directory is such a 
list. Names chosen from a telephone directory may yield a biased sample of 
the population at large, because ownership or nonownership of a telephone 
may not be independent of the variable under investigation, e.g., how a 
person intends to vote in an election. 

Forms of modified random sampling are sometimes used. One example 
is stratified random sampling. This procedure requires prior knowledge, 
perhaps obtained from census data, about the number or proportion of 
members in the population of various strata. Thus we may know the 
number of males and females, the number in various age groups, and the
like. In constructing the sample, members are drawn at random from the 
various strata. If the members are drawn such that the proportions in the 
various strata in the sample are the same as the proportions in those strata 
in the population, the sample is a proportional stratified sample. For 
example, a university may have 10,000 students, of which 7,000 are males 
and 3,000 are females. A sample of 100 students is required. We may draw 
70 males by a random method from the subpopulation, or stratum, of 
males, and 30 females from the subpopulation, or stratum, of females. 
Such a sample is a proportional stratified sample. 
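The university example can be sketched in a few lines. This is an illustration only: the function name `proportional_stratified_sample`, the member labels, and the fixed seed are assumptions made here, and the rounding of stratum sizes is exact only when the proportions divide evenly, as they do in this example.

```python
import random


def proportional_stratified_sample(strata, n, seed=0):
    """Draw at random, without replacement, from each stratum in proportion
    to the stratum's share of the population.

    strata: dict mapping a stratum name to a list of its members.
    Note: rounding may make the stratum sizes sum to slightly more or less
    than n when the proportions do not divide evenly.
    """
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        k = round(n * len(members) / total)
        sample[name] = rng.sample(members, k)
    return sample


# Hypothetical population: 7,000 males and 3,000 females.
strata = {
    "male": [f"M{i}" for i in range(7000)],
    "female": [f"F{i}" for i in range(3000)],
}
sample = proportional_stratified_sample(strata, 100)
print({name: len(members) for name, members in sample.items()})
```

With these figures the sample contains 70 males and 30 females, the same proportions as in the population.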

In much experimentation using human or animal subjects, the populations
from which the samples are drawn may not be amenable to precise
definition, and the methods of sampling described above may have little
relevance. For example, a laboratory experiment may use a sample of
experimental animals. Can such a sample be viewed meaningfully as a
random sample from a known population of animals? In a study of the
therapeutic effects of different operative procedures applied to a certain class
of brain tumor, all cases admitted to a particular hospital during a specified
time period may be used. This number may be small. In what sense may
this group of cases be viewed as a random sample drawn from a larger
group or population? Again, in an educational experiment the pupils in two
or three classes in a particular school and grade may be used as subjects.
Can such a group of subjects be viewed as a random sample or its
equivalent? Such samples as these are clearly not random. The method by which
they are selected is not a random method. Despite this the investigator may
wish to draw inferences that transcend the particular samples under study.
He may wish to argue that his findings are probably true for some large,
although perhaps ill-defined, population of experimental animals; or that
the therapeutic effects of his operative procedures may be extended to all
patients suffering from the particular class of brain tumor; or that the
results of the educational experiment may be generalized to a much larger
group of school pupils in the same grade.

In situations of the above type, where strict random sampling procedures
have not been used, does any basis exist for valid inference? A
common practice is to investigate a posteriori a variety of characteristics of
the sample. It may be possible to show that the sample does not differ
appreciably in these characteristics from a larger group or population.
Thus in the educational experiment the sample of students may be studied
with respect to age, sex, IQ, socioeconomic level of the parents, and other
characteristics. The sample may not differ much in these respects from a
larger group or population in the same grade. Because the sample shows no
bias on a number of known characteristics, that is, it may not differ from a
random sample as far as these characteristics are concerned, the
investigator may be prepared to regard it as representative of the larger group or
population and treat it as if it were a random sample. Frequently precise
knowledge is lacking about a larger reference group or population, and the
investigator must rely on accumulated past experience and intuition in the
attempt to detect possible bias. Clearly, where possible, random sampling
is to be preferred to methods such as these. It must be recognized,
however, that were we to insist on rigorous random sampling methods,
much experimentation would not be possible.

In practice, experiments are clearly not always conducted in the way our
statistical preconceptions suggest they should be conducted. Experienced
experimentalists are frequently aware of this. Much of the art of the
experimentalist is concerned with reaching conclusions from data which do not
satisfy some of the conditions necessary for rigorous inference.


9.3 SAMPLING ERRORS
Let us now consider the nature of the errors associated with particular
sample values. What precisely is a sampling error? A sampling error is a
difference between a population value, or parameter, and a particular
sample value. Thus, if $\mu$ is the population value of the mean and $\bar{X}_i$ is
an estimate based on a random sample of size $N$, then the difference
$\mu - \bar{X}_i = e_i$, where $e_i$ is a sampling error. Let us suppose that we know that
the mean scholastic aptitude test score for a population of 5,000 university
students is $\mu = 562$. A sample of 100 provides an estimated mean of
$\bar{X}_i = 566$. The sampling error in this case is $\mu - \bar{X}_i = 562 - 566 = -4$.
Ordinarily, $\mu$ is not known, and we are unable to specify $e_i$ exactly for any
particular sample. Despite this, meaningful statements can be made about the
magnitude of error which attaches to $\bar{X}_i$ as an estimate of the parameter $\mu$.
The reader should note that the concept of error in any context always
implies a parametric, true, fixed, or standard value from which a given
observed value may depart in greater or less degree. The idea that something
in the nature of a parametric or true value can meaningfully be defined is
essential to the concept of error. Without some appropriate definition of
such a value, the concept of error has no meaning, and no theory of error is
possible. Also no science is possible.

How may the magnitude of error be estimated and described? Common 
sense suggests that in the measurement of any quantity some appreciation 
of the magnitude of error may be obtained by repeating the measurements
a number of times, presumably under constant conditions, and observing
how these repeated measurements vary from each other. Thus in the 
measurement of the length of a bar of metal a series of separate measure- 
ments may be made under constant conditions. Let us suppose that five 
such measurements are 55.95, 56.23, 56.25, 56.41, and 56.54 in. In this case 
each measurement is an estimate of the same “true” length; hence the 
variation observed with repeated measurement is due to error. Let us 
suppose that five additional measurements are made using another mea- 
suring operation or procedure, these measurements being 54.80, 55.31, 
56.44, 56.52, 57.29 in. This latter set shows greater variation with repetition 
than the former. We may conclude that the magnitude of error associated 
with this latter set of measurements is greater. 
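The comparison of the two sets of measurements can be made concrete by computing a measure of variation for each. The sketch below uses the ten measurements quoted above; the choice of the standard deviation with divisor N - 1 (`statistics.stdev`) is an assumption made here for illustration, since the text at this point speaks only of "variation".

```python
from statistics import mean, stdev

# The two sets of five measurements of the same bar, in inches.
first_procedure = [55.95, 56.23, 56.25, 56.41, 56.54]
second_procedure = [54.80, 55.31, 56.44, 56.52, 57.29]

for label, values in (("first", first_procedure), ("second", second_procedure)):
    # The spread of repeated measurements is taken as an index of error.
    print(f"{label}: mean = {mean(values):.3f}, standard deviation = {stdev(values):.3f}")
```

The second procedure yields the larger standard deviation, in agreement with the conclusion that its error is greater.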

The above example is concerned with errors associated with particular
observations, namely, measurements of a bar of metal. In considering the
magnitude of error associated with, say, a sample mean $\bar{X}_i$ as an estimate
of a population mean $\mu$, the situation is similar. The problem may be
approached experimentally by considering how values of $\bar{X}_i$ vary in
repeated samples of size $N$. Thus the mean scores on the scholastic
aptitude test for five different samples of 100 students, drawn from a
population of 5,000, may be 552, 558, 562, 568, and 569. These five sample means
may be viewed as estimates of the same population mean $\mu$; that is, the
value that would have been obtained were information available on all
5,000 members of the population. The variation of these five means one
from another may be attributed to sampling error.

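In general, a number, say $k$, of samples of size $N$ may be drawn at
random from the same population and a mean calculated for each sample.
These means may be represented by the symbols $\bar{X}_1, \bar{X}_2, \bar{X}_3, \ldots, \bar{X}_k$. We
may write

$$
\begin{aligned}
\mu - \bar{X}_1 &= e_1 \\
\mu - \bar{X}_2 &= e_2 \\
\mu - \bar{X}_3 &= e_3 \\
&\;\;\vdots \\
\mu - \bar{X}_k &= e_k
\end{aligned}
$$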

The variance and the standard deviation are ordinarily used to describe the
magnitude of variation in any set of observations or values. In describing
the magnitude of variation in sample means with repeated sampling, the
variance and standard deviation are also used. These statistics describe
the magnitude of sampling error, that is, the magnitude of error associated
with $\bar{X}_i$ as an estimate of $\mu$. Note that the variance and standard deviation
of sample means $\bar{X}_i$ are the same as the variance and standard deviation of
the sampling error $e_i$, because $\mu$ is a constant.
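The closing remark, that the sample means and the sampling errors have the same variance, may be verified with the five sample means given earlier. The sketch assumes, for illustration only, that the population mean is the value 562 used in the earlier example; any other constant would give the same variance of errors.

```python
from statistics import pvariance

mu = 562  # population mean, assumed known here for the sake of illustration
sample_means = [552, 558, 562, 568, 569]

# Sampling errors defined as in the text: e_i = mu - X_i.
errors = [mu - m for m in sample_means]

var_means = pvariance(sample_means)
var_errors = pvariance(errors)
print(errors, var_means, var_errors)
```

Subtracting each mean from a constant reverses the sign of every deviation and shifts the whole set, neither of which alters the squared deviations about the mean.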


9.4 SAMPLING DISTRIBUTIONS 


In the above discussion the problem of estimating error has been
approached experimentally; that is, we considered the actual drawing of a
number of samples and approached the experimental study of error
through observed sample-to-sample fluctuation. Consider for illustrative 
purposes a small finite population of eight members. Let the members of 
the population be cards numbered from 1 to 8. These cards may be 
shuffled, a sample of four cards drawn without replacement, and a mean 
calculated for the sample. This procedure may be repeated 100 times, and 
a frequency distribution made of the 100 sample means. This distribution 
is an experimental sampling distribution, and its standard deviation is a 
measure of the fluctuation in means from sample to sample. 
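The card-drawing procedure can be imitated by a short simulation. This is a sketch and not the drawing actually carried out for the text: the pseudo-random generator and its seed are assumptions, so the resulting distribution will differ in detail from any distribution obtained with real cards.

```python
import random
from collections import Counter
from statistics import mean, pstdev

rng = random.Random(1)          # arbitrary fixed seed so the run is repeatable
population = list(range(1, 9))  # cards numbered 1 to 8

# Draw 100 samples of four cards without replacement; record each mean.
sample_means = [sum(rng.sample(population, 4)) / 4 for _ in range(100)]

# Experimental sampling distribution: frequency of each observed mean.
distribution = Counter(sample_means)
for value in sorted(distribution):
    print(f"{value:.2f}  {distribution[value]}")

print("mean of means:", mean(sample_means))
print("standard deviation of means:", pstdev(sample_means))
```

Changing the seed gives a different experimental distribution, which is itself a small demonstration of sample-to-sample fluctuation.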

A theoretical, as distinct from an experimental, approach may be used. 
Given a finite population of eight members, a limited number of different 
samples of four cards exist. The number of such samples is the number of 
combinations of eight things taken four at a time, or $C_4^8 = 70$. Each of
these 70 samples may be considered equiprobable. The means for the 70
possible samples may be ascertained and a frequency distribution prepared.
This frequency distribution is a theoretical sampling distribution. It
is obtained by direct reference to probability considerations. No drawing of 
actual samples is involved. The standard deviation of the theoretical 
sampling distribution is a measure of fluctuation in means from sample to 
sample. 

In the above example the population is small and finite. In practice, most 
of the populations with which we deal are indefinitely large, or if finite, they 
are so large that for all practical purposes they can be considered
indefinitely large. In the study of sampling error the approach used in dealing
with an indefinitely large population is a simple extension of that used with
a small finite population. The distinction between an experimental and
theoretical sampling distribution still applies. When the population is
indefinitely large, the theoretical sampling distribution of, for example, the mean
is the frequency distribution of means of the indefinitely large number of 
samples of size N which theoretically could be drawn. 

The theoretical sampling distributions are known for all commonly used
statistics. The standard deviation of the sampling distribution is called the
standard error. Thus a standard error is always a standard deviation which
describes the variability of a statistic over repeated sampling. The
standard deviation of a theoretical sampling distribution is, in effect, a
population parameter. It is descriptive of the variation of a statistic in a complete
population of sample values. The standard deviation of the theoretical
sampling distribution of the mean is represented by the symbol $\sigma_{\bar{x}}$. In
practice this standard deviation must be estimated from sample data. This
estimate in the case of the mean may be represented by the symbol $s_{\bar{x}}$. For
most statistics fairly simple formulas are available for estimating the
standard deviation of the theoretical sampling distribution.

The theoretical sampling distributions of some statistics are normal, or
approximately so; others are not. For example, the theoretical sampling
distribution of the mean $\bar{X}$ is normally distributed in sampling from an
indefinitely large normally distributed population. The sampling
distribution of the correlation coefficient presents a complicated problem. It is not
normally distributed except under certain special circumstances. When
the shape of the sampling distribution is known, certain kinds of state- 
ments can be made about a population value from a sample estimate. For 
example, it is possible to fix limits above and below a sample value and 
assert with a known degree of confidence that the population parameter 
falls within those limits. The fixing of such limits requires a knowledge of 
the shape of the sampling distribution. 


9.5 SAMPLING DISTRIBUTION OF MEANS
FROM A FINITE POPULATION


In practice, most samples are viewed as drawn from indefinitely large pop- 
ulations. The essential ideas of sampling may, however, be conveniently 
illustrated with reference to a small finite population. Suppose, as men- 
tioned above, that we have a population of eight cards numbered from 1 to 
8. These cards may be shuffled, and a sample of four cards drawn at 
random. After each card is drawn it is not returned; that is, the sampling is 
without replacement. A mean X may be calculated for this sample. The 
four cards may now be returned, the eight cards shuffled, another sample 
of four cards drawn, and another mean calculated. Let us continue this
procedure until 100 samples of four cards have been drawn and their 
means calculated. Table 9.1, col. 3, shows the frequency distribution of 100 
such sample means. This distribution is an experimental sampling distribu- 
tion of means. It shows experimentally how the means of samples of four 
drawn at random without replacement from a population of eight vary from 
sample to sample. The mean of the experimental sampling distribution $\bar{X}_{\bar{x}}$,
that is, the mean of the 100 means based on samples of four, is found to be
4.56. The mean of the population from which the samples have been drawn
is the mean of the integers from 1 to 8 and is 4.50. The standard deviation
of the 100 means $s_{\bar{x}}$ is found to be .834.

The investigation of the fluctuation in sample means may be approached 
theoretically. The number of different samples of four in sampling without 
replacement from a population of eight members is the number of
combinations of eight things taken four at a time, or $C_4^8 = 70$. These 70 samples
may be considered equiprobable. A listing of the 70 samples may readily be
made, and the means calculated. The sample with the smallest mean will 
be 1, 2, 3, 4; the mean X = 2.50. The sample with the largest mean will be 
5, 6, 7, 8; here X = 6.50. Thus X will range from 2.50 to 6.50. Table 9.1, col. 
5, shows the frequency distribution of the 70 sample means. This distribu- 
tion is a theoretical sampling distribution of the mean of samples of four 
from a small finite population of eight members. It is based on the idea that 
there are 70 possible combinations of eight things taken four at a time, all 
combinations being equiprobable. 
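The listing of the 70 equiprobable samples is easily mechanized. The sketch below enumerates every combination of four cards from eight, tabulates the frequency of each sample mean, and computes the mean and standard deviation of the resulting theoretical sampling distribution. The variable names are illustrative.

```python
from collections import Counter
from itertools import combinations
from statistics import mean, pstdev

population = list(range(1, 9))  # cards numbered 1 to 8

# Every possible sample of four drawn without replacement.
samples = list(combinations(population, 4))
sample_means = [sum(s) / 4 for s in samples]

# Theoretical sampling distribution: frequency and probability of each mean.
freq = Counter(sample_means)
for value in sorted(freq):
    print(f"{value:.2f}  {freq[value]:2d}  {freq[value] / len(samples):.3f}")

mu_xbar = mean(sample_means)       # mean of the sampling distribution
sigma_xbar = pstdev(sample_means)  # its standard deviation (divisor 70)
print(len(samples), mu_xbar, round(sigma_xbar, 3))
```

No drawing of actual samples is involved; the distribution follows from the enumeration alone.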

The mean of the theoretical sampling distribution may be calculated.
This mean $\mu_{\bar{x}}$ is found to be 4.50. The standard deviation $\sigma_{\bar{x}}$ is found to be
.866. These values do not differ markedly from the mean and standard
deviation of the experimental sampling distribution, these being $\bar{X}_{\bar{x}} = 4.56$
and $s_{\bar{x}} = .834$. Presumably, had a larger number of samples been drawn,
say, 200 or 1,000, the experimental sampling distribution would be
observed to approximate more closely to the theoretical distribution.


Table 9.1  Experimental and theoretical sampling distributions of means of
samples of four drawn from a population of eight members

                          Experimental        Theoretical
                          distribution        distribution
  $\Sigma X$    $\bar{X}$         f       p           f       p
    (1)      (2)         (3)     (4)         (5)     (6)

    10      2.50          1     .010          1     .014
    11      2.75          2     .020          1     .014
    12      3.00          0     .000          2     .029
    13      3.25          5     .050          3     .043
    14      3.50          7     .070          5     .071
    15      3.75          7     .070          5     .071
    16      4.00          8     .080          7     .100
    17      4.25         11     .110          7     .100
    18      4.50         13     .130          8     .114
    19      4.75         10     .100          7     .100
    20      5.00         10     .100          7     .100
    21      5.25          9     .090          5     .071
    22      5.50          7     .070          5     .071
    23      5.75          4     .040          3     .043
    24      6.00          3     .030          2     .029
    25      6.25          1     .010          1     .014
    26      6.50          2     .020          1     .014


The mean and standard deviation of the theoretical sampling
distribution of Table 9.1 were calculated directly from the 70 possible sample
means. These values may, however, be readily obtained without using this
time-consuming method. It can be shown that the mean of the theoretical
sampling distribution is equal to the population mean; that is, $\mu_{\bar{x}} = \mu$. In
our example the mean of the sampling distribution of samples of four from
a population of eight members is observed to be 4.50. Likewise, the
population mean, that is, the mean of the integers from 1 to 8, is also 4.50. The


standard deviation of the theoretical sampling distribution is given by the
formula

[9.1]
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{N}} \sqrt{\frac{N_p - N}{N_p - 1}}$$

where $\sigma$ = standard deviation in population
      $N_p$ = number of members in population
      $N$ = sample size

In our example $\sigma$ is the standard deviation of the integers from 1 to 8 and is
equal to 2.29. Population and sample size are, respectively, 8 and 4. Hence

$$\sigma_{\bar{x}} = \frac{2.29}{\sqrt{4}} \sqrt{\frac{8 - 4}{8 - 1}} = .866$$

If, then, the standard deviation $\sigma$ of the population is known, we can
readily obtain from the above formula the standard deviation of the
theoretical sampling distribution and use this as a measure of fluctuation in means
from sample to sample.

A knowledge of the standard deviation of a theoretical sampling
distribution is of limited usefulness unless additional information is available on
the shape of the distribution. In certain instances sampling distributions
are normal, or approximately normal in form. The theoretical sampling
distribution of Table 9.1 departs appreciably from the normal form. If,
however, both sample and population size were increased, the distribution
would approximate more closely to the normal form. For $N = 30$ and
$N_p = 100$, the normal distribution would be a good approximate fit. If the
sampling distribution is approximately normal, we can, given its standard
deviation, readily estimate the probability of obtaining values equal to or
greater than any given size in random sampling from the population.
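Formula [9.1] may be checked for the example of eight cards. The sketch below computes the population standard deviation of the integers 1 to 8 and applies the formula with N = 4 and a population of 8. The function name `se_mean_finite` is an illustrative helper introduced here.

```python
from math import sqrt
from statistics import pstdev


def se_mean_finite(sigma, n, n_pop):
    """Standard error of the mean in sampling without replacement from a
    finite population: sigma / sqrt(n), times the finite-population term."""
    return sigma / sqrt(n) * sqrt((n_pop - n) / (n_pop - 1))


population = list(range(1, 9))
sigma = pstdev(population)  # standard deviation of the integers 1 to 8
se = se_mean_finite(sigma, n=4, n_pop=8)
print(round(sigma, 2), round(se, 3))
```

The result agrees to three decimals with the value obtained by working directly from the 70 possible sample means.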


9.6 SAMPLING DISTRIBUTION OF MEANS FROM
AN INDEFINITELY LARGE POPULATION


Many populations may be conceptualized as comprised of an indefinitely
large number of members. Most applications of [...] from a finite
population with replacement, that is, [...] member is returned to the
population prior to the [...] population with replacement [...]
probabilities are unchanged [...] problems of sampling from an
indefinitely large population can be approached through the study of finite
populations where samples are drawn with replacement.

To illustrate sampling from an indefinitely large population, an artificial
population was constructed. This population was comprised of 1,611 cards
containing the numbers from 1 to 25. The distribution of numbers is
approximately normally distributed. The distribution of this population is
shown in Table 9.2. The mean $\mu$ of the population is 13, and the standard
deviation $\sigma$ is 3.56. The cards were inserted in a box, and samples of 10
cards drawn with replacement; that is, a card was drawn, its number
noted, and the card then returned to the box before the next draw.
Altogether 100 samples of 10 cards were drawn, and the 100 means
calculated. Table 9.3 shows the means of the samples. Table 9.4, col. 2,
shows a frequency distribution of these means. This distribution is an
experimental sampling distribution of means based on samples of size 10.
The mean of the sampling distribution $\bar{X}_{\bar{x}}$ is 13.205, and the standard
deviation $s_{\bar{x}}$ is 1.139. This standard deviation is a description based on
experimental data of the sample-to-sample fluctuation of means of samples of size
10 drawn at random from this population.


Table 9.2*  Population from which samples were drawn: frequency
distribution of numbers

  Number   Frequency      Number   Frequency

     1          1           14        174
     2          2           15        154
     3          4           16        127
     4          7           17         96
     5         14           18         67
     6         26           19         43
     7         43           20         26
     8         67           21         14
     9         96           22          7
    10        127           23          4
    11        154           24          2
    12        174           25          1
    13        181
                            Total   1,611

* Tables 9.2 to 9.4 are reproduced from R. W. B. Jackson and
George A. Ferguson, Manual of educational statistics, University
of Toronto, Department of Educational Research, Toronto, 1942.


Table 9.3  Means of samples of 10 drawn from the population in table 9.2

10.9 13.5 11.7 13.3 13.8 12.5 15.0 12.7 14.3 12.7
13.0 13.2 14.0 13.1 13.2 12.7 12.6 11.5 13.2 12.9
12.4 13.9 14.1 12.2 13.1 11.7 11.5 14.6 12.6 12.9
13.9 14.0 11.7 12.1 13.2 13.6 14.4 14.0 12.2 13.7
12.6 11.6 11.8 12.1 13.1 13.2 12.5 14.0 16.4 12.2

12.6 13.7 13.6 14.0 12.1 13.2 14.8 13.6 12.5 14.5
14.4 13.9 13.8 15.1 14.2 14.4 13.5 12.7 14.5 14.4
12.9 11.3 14.5 13.0 12.0 13.3 12.7 14.8 11.3 11.0
12.7 14.6 15.2 14.1 16.1 14.7 12.3 11.2 14.3 14.7
12.9 12.3 11.9 14.0 14.5 12.4 11.9 12.3 12.4 12.6


The mean and standard deviation of the sampling distribution need not
be estimated by the rather laborious experimental approach described
above. It can be shown that the mean of the theoretical sampling
distribution of the mean in sampling from an indefinitely large population is equal
to the population mean; that is, $\mu_{\bar{x}} = \mu$. It can also be shown that the
standard deviation of the sampling distribution is given by

[9.2]
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{N}}$$

where $\sigma$ = standard deviation in population
      $N$ = size of sample


The reader will observe that the difference between this formula and the
formula previously given for the standard deviation of the sampling
distribution for the means of samples from a finite population resides in the
absence here of the term $\sqrt{(N_p - N)/(N_p - 1)}$. As $N_p$ increases, this term
approaches 1 as a limit. It is equal to 1 when the population is indefinitely
large. The standard deviation of the theoretical sampling distribution is
$\sigma_{\bar{x}} = 3.56/\sqrt{10} = 1.126$. This is very close to the standard deviation of the
experimental sampling distribution $s_{\bar{x}}$, which was found to be 1.139.


Table 9.4  Experimental and theoretical sampling distribution of 100 sample
means for samples of size 10 drawn from the population of table 9.2

                          Frequency
  Class interval   Experimental   Theoretical

    16.5-17.4           -              .1
    15.5-16.4           2             1.4
    14.5-15.4          13             8.4
    13.5-14.4          27            24.6
    12.5-13.4          31            34.3
    11.5-12.4          22            22.8
    10.5-11.4           5             7.2
     9.5-10.4           -             1.1

    Total             100            99.9
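The figures quoted for the artificial population can be recomputed from the frequencies of Table 9.2, and the drawing of samples with replacement can be imitated. The sketch below is illustrative: the simulated draws use a pseudo-random generator with an arbitrary seed, so they will not reproduce the particular means of Table 9.3.

```python
import random
from math import sqrt
from statistics import mean, pstdev

# Frequencies of the numbers 1 to 25, as given in Table 9.2.
numbers = list(range(1, 26))
freqs = [1, 2, 4, 7, 14, 26, 43, 67, 96, 127, 154, 174, 181,
         174, 154, 127, 96, 67, 43, 26, 14, 7, 4, 2, 1]

total = sum(freqs)
mu = sum(x * f for x, f in zip(numbers, freqs)) / total
sigma = sqrt(sum(f * (x - mu) ** 2 for x, f in zip(numbers, freqs)) / total)

# Formula [9.2]: standard error of the mean for samples of size 10.
se_theoretical = sigma / sqrt(10)
print(total, mu, round(sigma, 2), round(se_theoretical, 3))

# Imitate 100 samples of 10 cards drawn with replacement (seed is arbitrary).
rng = random.Random(2)
sim_means = [sum(rng.choices(numbers, weights=freqs, k=10)) / 10 for _ in range(100)]
print(round(mean(sim_means), 3), round(pstdev(sim_means), 3))
```

The standard deviation of the simulated means will vary from run to run about the theoretical value, much as the experimental value 1.139 differs slightly from 1.126.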

The theoretical sampling distribution of the means of samples drawn
from a normal population is normal. Thus if we know that the population
distribution is normal, we know that the sampling distribution of means is
normal. Regardless of the shape of the population distribution, the
sampling distribution of means will approximate the normal form as $N$
increases in size. For practical purposes the distribution may be taken as
approximately normal for samples of reasonable size, except in the case of
fairly gross departures of the population from normality. The theoretical
normal frequencies have been calculated for our illustrative example.
These theoretical frequencies are shown in Table 9.4, col. 3. These are the
expected normal frequencies for a normal curve with a mean of 13.00 and a
standard deviation of 1.139. The differences between the experimental and
the theoretical normal distribution are not very great.

Examination of the formula $\sigma_{\bar{x}} = \sigma/\sqrt{N}$ indicates that the standard
error of the mean is directly related to the standard deviation of the
population and inversely related to the size of sample. Thus the greater the
variation of the variable in the population, the greater the standard error; also
the larger the size of $N$, the smaller the standard error. The standard error
of means of samples of $N = 1$, the smallest sample size possible, is equal to
the population standard deviation. For any fixed value of $\sigma$ the standard
error can be made as small as we like by increasing the size of the sample.
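The dependence of the standard error on sample size can be tabulated directly. The sketch below takes the population standard deviation of 3.56 from the illustrative example; the particular sample sizes are arbitrary choices made here.

```python
from math import sqrt


def standard_error(sigma, n):
    """Standard error of the mean for an indefinitely large population."""
    return sigma / sqrt(n)


sigma = 3.56  # population standard deviation from the illustrative example
for n in (1, 4, 16, 100, 400):
    # Quadrupling the sample size halves the standard error.
    print(f"N = {n:3d}   standard error = {standard_error(sigma, n):.3f}")
```

Because the sample size enters through its square root, each halving of the standard error requires a fourfold increase in the size of the sample.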


9.7 SAMPLING DISTRIBUTION OF PROPORTIONS


Many problems require the use of proportions. A study of the sampling
distribution of a proportion may be approached either experimentally or
theoretically. To illustrate, consider an urn containing a finite number, $N_p$, of
black and white chips. Denote the proportion of black and white chips by $\theta$
and $1 - \theta$, respectively. Let us draw a large number of samples of size $N$ at
random, without replacement, from the urn, observe the proportion of
black chips in each sample, and make a frequency distribution of these
proportions. This frequency distribution is an experimental sampling
distribution of proportions for samples of size $N$. As with the arithmetic mean,
we may use a theoretical, as distinct from an experimental, approach. In
drawing samples of $N$ from a finite population of $N_p$ members, the number
of different equiprobable samples is $C_N^{N_p}$. The proportion of black chips
may be calculated from each of these samples, and a frequency
distribution made of the proportions. This distribution is a theoretical sampling
distribution of proportions.
For illustrative purposes consider a hypothetical population of three
black and three white chips. Denote the members of this population by $B_1$,
$B_2$, $B_3$, $W_4$, $W_5$, $W_6$, the subscript identifying the particular population
member. We may consider the set of equiprobable samples of three
members which may be drawn without replacement from this population.
The number of such samples is $C_3^6 = 20$. The first sample is $B_1B_2B_3$, and
the proportion of black chips is $p = 1.00$. The second sample is $B_1B_2W_4$
with $p = .67$; the third, $B_1B_2W_5$ with $p = .67$, and so on. A frequency
distribution may be made of these proportions, and is as follows:

     p       f      f/N

    .00      1      .05
    .33      9      .45
    .67      9      .45
   1.00      1      .05

   Total    20     1.00

The standard deviation, or standard error, of this theoretical sampling dis- 
tribution may be readily calculated and is с, = .224. This standard error 
may be obtained directly by the formula 


_ ра NN 
TRUUN CONS CO UNS 


р 


In the above example @ = .50 since three of the six chips are black. Popula- 
tion and sample size respectively are Ny = 6 and № = 3. Hence 


[5x3 6-3 _ 
ср = 3 х6 = 24 


which agrees with the value obtained by direct calculation. 

The discussion above relates to sampling without replacement from a
finite population. As with the arithmetic mean, the term $(N_p - N)/(N_p - 1)$
in formula [9.3] approaches unity for any finite value of N as $N_p$
approaches infinity. Thus this term may be considered equal to unity for an
indefinitely large population, and we obtain


[9.4]    $$\sigma_p = \sqrt{\frac{\theta(1-\theta)}{N}}$$


as the standard error of a proportion in sampling from an indefinitely large
population, or in sampling from a finite population with replacement. In such
sampling the expected, or theoretical, distribution of the number of
black chips in samples of size N, as distinct from the proportion of black
chips, is given by the terms of the binomial $[\theta + (1 - \theta)]^N$. The mean and
standard deviation of this distribution are $N\theta$ and $\sqrt{N\theta(1-\theta)}$, respectively.
We are interested in the distribution of the proportion, instead of the number,
of black chips in the samples. To obtain the standard deviation of the
distribution of the proportion, as distinct from the number, of black chips in
samples of size N, we multiply $\sqrt{N\theta(1-\theta)}$ by 1/N to obtain
$\sigma_p = \sqrt{\theta(1-\theta)/N}$, which is the same as formula [9.4] above. To illustrate,
let $\theta = .25$ and $1 - \theta = .75$. The expected distribution of the number of black
chips in samples of size 10 is given by expanding the binomial $(.25 + .75)^{10}$.
The mean in this example is $10 \times .25 = 2.5$, and the standard deviation is
$\sqrt{10 \times .25 \times .75} = 1.37$. The standard deviation of the distribution of the
proportion of black chips in samples of size 10 is obtained by dividing 1.37
by 10 and is .137.

Formulas [9.3] and [9.4] assume that $\theta$ is known. In practice, $\theta$ is
usually not known and the sample value p is used as an estimate of $\theta$.
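
The agreement between direct calculation and formula [9.3] in the six-chip example can be checked by enumerating every equiprobable sample. The following is a minimal sketch in Python; the function names are illustrative and are not part of the text.

```python
from fractions import Fraction
from itertools import combinations
from math import sqrt


def sampling_distribution(n_black, n_white, n):
    """Tabulate the proportion of black chips over every sample of size n
    that can be drawn without replacement from the population."""
    chips = ["B"] * n_black + ["W"] * n_white
    counts = {}
    for members in combinations(range(len(chips)), n):
        p = Fraction(sum(chips[i] == "B" for i in members), n)
        counts[p] = counts.get(p, 0) + 1
    return counts


def sd_by_enumeration(counts):
    """Standard deviation of the tabulated proportions."""
    total = sum(counts.values())
    mean = sum(p * f for p, f in counts.items()) / total
    variance = sum(f * (p - mean) ** 2 for p, f in counts.items()) / total
    return sqrt(variance)


def sd_by_formula(theta, n, n_pop):
    """Formula [9.3]: standard error of a proportion, finite population."""
    return sqrt(theta * (1 - theta) / n * (n_pop - n) / (n_pop - 1))


counts = sampling_distribution(n_black=3, n_white=3, n=3)
for p in sorted(counts):
    print(f"p = {float(p):.2f}   f = {counts[p]}")
print(round(sd_by_enumeration(counts), 3), round(sd_by_formula(0.5, 3, 6), 3))
```

Exact fractions are used for the proportions so that samples with the same proportion fall in the same class without rounding error.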


SAMPLING DISTRIBUTION OF DIFFERENCES 


For certain purposes a knowledge of the sampling distribution of the
difference between two statistics, such as the difference between two
arithmetic means or two proportions, is required. To conceptualize the
sampling distribution of the difference between, say, two arithmetic means, let
us consider two indefinitely large populations whose means are equal; that
is, $\mu_1 = \mu_2$. Let $\bar{X}_1$ be the mean of a sample of $N_1$ cases drawn at random
from the first population and $\bar{X}_2$ be the mean of a sample of $N_2$ cases
drawn from the second population. The difference between means is
$\bar{X}_1 - \bar{X}_2$. Since $\mu_1 = \mu_2$ this difference results from sampling error. A large
number of pairs of samples may be drawn, and a frequency distribution
made of the differences. It describes how the differences between means
chosen at random from two populations, where $\mu_1 = \mu_2$, will vary with
repeated sampling. From this distribution we may estimate the probability
of obtaining a difference of any specified size in drawing samples at
random from populations where $\mu_1 = \mu_2$. By considering an indefinitely
large number of pairs of samples we arrive at the concept of a theoretical
sampling distribution of differences between sample means. In this
situation the individual measurements in the two populations are not paired
with one another. The samples are independent. The means may be viewed
as paired at random. No correlation exists between the pairs of means.

The variance of the sampling distribution of differences describes how
the differences vary with repeated sampling. Consider the case of
independent samples. If $\sigma_{\bar{x}_1}^2 = \sigma_1^2/N_1$ is the variance of the sampling distribution
of means drawn from one population and $\sigma_{\bar{x}_2}^2 = \sigma_2^2/N_2$ is the
corresponding variance from the other population, then the variance of the
sampling distribution of differences between means is the sum of the two
variances. Thus


[9.5]    $$\sigma_{\bar{x}_1 - \bar{x}_2}^2 = \sigma_{\bar{x}_1}^2 + \sigma_{\bar{x}_2}^2 = \frac{\sigma_1^2}{N_1} + \frac{\sigma_2^2}{N_2}$$


When $\sigma_1^2 = \sigma_2^2 = \sigma^2$, the variances in the two populations being equal, we
may write


[9.6]    $$\sigma_{\bar{x}_1 - \bar{x}_2}^2 = \sigma^2 \left( \frac{1}{N_1} + \frac{1}{N_2} \right)$$


Consider now a situation where measurements are paired with one
another. Such data arise, for example, where measurements are made on
the same group of subjects under both control and experimental
conditions. The paired measurements may be correlated. In this instance, in
approaching the sampling distribution of differences between means, we
conceptualize two populations of paired measurements with equal means;
thus $\mu_1 = \mu_2$. Denote the correlation between the paired measurements by
the symbol $\rho_{12}$. Samples of size N are drawn at random, and the differences
between means obtained. The distribution of differences between means
for an indefinitely large number of samples is the sampling distribution of
differences for correlated populations.


For correlated populations the variance of the sampling distribution of
differences may be shown to be


[9.7]    $$\sigma_{\bar{x}_1 - \bar{x}_2}^2 = \sigma_{\bar{x}_1}^2 + \sigma_{\bar{x}_2}^2 - 2\rho_{12}\,\sigma_{\bar{x}_1}\sigma_{\bar{x}_2}$$


where $\rho_{12}$ is the correlation in the population. Note that the formula for
independent samples is a particular case of the more general formula for
correlated samples. It is the particular case which arises when $\rho_{12} = 0$. In
the correlated case $N_1 = N_2 = N$.


Formulas [9.5] to [9.7] are simple applications of the formula for the
variance of differences.
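
The relation between the independent and the correlated case can be seen by evaluating the general formula with and without a correlation term. The sketch below is illustrative only: the function name and the numerical values (population variances of 100, samples of 25, a correlation of .60) are assumptions chosen for the example and do not come from the text.

```python
from math import sqrt


def variance_of_difference(var1, n1, var2, n2, rho=0.0):
    """Variance of the sampling distribution of the difference between two
    means. With rho = 0 this is the independent-samples case; a nonzero rho
    gives the correlated case, where n1 and n2 are expected to be equal."""
    var_mean1 = var1 / n1  # sampling variance of the first mean
    var_mean2 = var2 / n2  # sampling variance of the second mean
    return var_mean1 + var_mean2 - 2 * rho * sqrt(var_mean1) * sqrt(var_mean2)


independent = variance_of_difference(100, 25, 100, 25)            # rho = 0
correlated = variance_of_difference(100, 25, 100, 25, rho=0.60)   # paired data
print(independent, correlated)
```

Setting the correlation to zero removes the last term, which is why the formula for independent samples appears as a particular case of the formula for correlated samples.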


EXERCISES


1 Indicate the difference between (a) a random sample and a stratified
random sample, (b) a stratified random sample and a proportional
stratified sample, (c) an experimental and a theoretical sampling
distribution.


2 How would you proceed to draw a random sample of 100 university
students?


3 How would you proceed to draw a systematic stratified sample of 100
students from the students in a university?


4 Would a random sample of names selected from a telephone book be
considered appropriate for the study of voting behavior?


distribution of means, (5) the st. 


6 The standard deviation of the sampling distribution of $\bar{X}$ is $\sigma/\sqrt{N}$.
What is the standard deviation of the sampling distribution of $N\bar{X}$ or
$\sum X$?


7 A university has a population of 1,000 students. The standard
deviation of scholastic aptitude test scores in this population is 80.
Calculate the standard errors of mean scholastic aptitude test scores for a
sample of 100 students drawn with and without replacement.


8 The pages of this book, excluding the Appendix, may be defined as a
population. Some pages contain no formulas, some one, some two,
and so on. (a) Obtain the population distribution of the number of
formulas per page; that is, ascertain the number of pages with no
formulas, the number with one, with two, and so on. Count only
numbered formulas. (b) Calculate the mean and standard deviation for the
distribution of (a) above. (c) Draw without replacement five random
samples of 30 pages each, using an appropriate random sampling
method. Calculate for each sample the mean number of formulas per
page and the standard deviation of means for the five samples. How
does this standard deviation compare with that obtained from formula
[9.1]? Repeat (c) using a random sample drawn with replacement,
and compare the resulting standard deviation with that obtained from
formula [9.2].

9 The variance of the sampling distribution of the mean for samples of
100 cases drawn from an indefinitely large population is 20. How large
should the samples be to reduce this variance by one-half? How large
should the samples be to reduce the standard deviation of the
distribution by one-half?


10 A population consists of six black and two white chips. Obtain the
frequency distribution of proportions of black chips in samples of four
drawn from this population. What is the standard deviation of this
distribution?

11 Will a negative correlation between paired observations increase or
decrease the standard error of the difference between two means?


ESTIMATION


10.1 INTRODUCTION


This chapter considers some aspects of the problem of estimating
population parameters from sample values. A distinction is commonly made
between two types of estimates, point estimates and interval estimates. A
point estimate is the value obtained by direct calculation on the sample
values. If in a particular sample the mean $\bar{X} = 26.88$, this is a point
estimate of the parameter $\mu$. Another approach is to specify an interval
within which we may assert with some known degree of confidence that the
population mean lies. Thus, for example, instead of the point estimate $\bar{X}$ we
may perform a simple calculation, which will shortly be described, and
assert with 95 per cent confidence that the population mean falls within the
limits 24.92 and 28.84. These values are called confidence limits, and the
interval they contain is called a confidence interval. Such an interval is an
interval estimate.


10.2 PROPERTIES OF ESTIMATES


Methods of estimation are sometimes said to yield unbiased, consistent, 
efficient, and sufficient estimates. These are desirable characteristics and 
serve as criteria for preferring one method of estimation to another. 

A method of estimation provides an unbiased estimate when the mean of 
a large number of sample values, obtained by repeated sampling, 
approaches the population value in the limit as the number of samples 
increases. This simply means that a statistic is unbiased when it displays 
no systematic tendency to be either greater than or less than the population 
parameter; that is, it is not subject to a constant error. The arithmetic
mean is an unbiased estimate. The sample mean $\bar{X}$ exhibits no systematic
tendency to be either greater than or less than the parameter $\mu$. Stated in
somewhat different language, an estimate is unbiased when its expected
value is equal to the parameter it purports to estimate.

The expected value of a statistic is the value we should expect to obtain
upon averaging the values of the statistic over an indefinitely large number
of repeated random samples. It is the value we should expect in
the long run. The expected value of a variable is obtained by multiplying
each value of the variable by its associated probability and summing the
products. For example, consider the following frequency distribution with
corresponding probabilities:


    X        f       p

    2       40     .40
    3       50     .50
    4       10     .10

  Total    100    1.00


The expected value of X, that is, E(X), is $2 \times .40 + 3 \times .50 + 4 \times .10 =
2.70$. Note that E(X) is also the mean of X obtained by multiplying the
values of X by the frequency of X and dividing by the total frequency. In
general the expected value of the mean, $E(\bar{X})$, is the population mean $\mu$. A
statistic is a biased estimate when its expected value does not tend toward
the population value but departs in systematic fashion from it. For
example, it may be shown that the expected value of the variance defined
as $s^2 = \sum (X - \bar{X})^2 / N$ is not $\sigma^2$ but is $\sigma^2 (N - 1)/N$. Thus the variance,
so defined, which we would expect to get in the long run, is a
biased estimate of $\sigma^2$.

A method of estimation is said to yield a consistent estimate if that
estimate approaches the population parameter more closely as sample size
increases. The arithmetic mean is a consistent estimate in that it tends to
draw closer to the population parameter with increase in sample size.

The efficiency of a method of estimation is related to sampling
variance. The relative efficiency of two methods of estimation is the ratio of
the two sampling variances, the larger variance being placed in the
denominator. To illustrate, when the distribution of a variable in the population is
symmetrical and unimodal, both the mean and the median are estimates of
the same population parameter, $\mu$. The sampling variance of the mean is
$\sigma_{\bar{x}}^2 = \sigma^2/N$, and of the median, $\sigma_{mdn}^2 = 1.57\sigma^2/N$. The relative efficiency is
then


[10.1]    $$\text{Relative efficiency} = \frac{\sigma^2/N}{1.57\sigma^2/N} = .64$$


Thus for this type of population the mean is more efficient than the median.

Relative efficiency has meaning in terms of sample size. A median
calculated on a sample of 100 cases has a sampling variance equal to that of a
mean calculated on 64 cases. The mean is a more economical estimate and
the saving achieved by its use in preference to the median is 36 cases
in 100.

A method of estimation is sufficient if it is more efficient than any other
possible method of estimation, that is, if its sampling variance is less. A
sufficient method of estimation uses all the information in the sample. In
this context, the concept of information is assigned a precise mathematical
meaning.
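
The notions of expected value and bias can be illustrated with the small distribution used above (X = 2, 3, 4 with probabilities .40, .50, .10). The sketch below is an illustration under stated assumptions, not a general proof: it treats the table as an indefinitely large population, enumerates every ordered sample of three cases with its probability, and averages the variance computed with divisor N. The function names and the sample size of three are chosen only for the example.

```python
from itertools import product

values = [2, 3, 4]
probs = [0.40, 0.50, 0.10]


def expected_value(xs, ps):
    """Multiply each value by its probability and sum the products."""
    return sum(x * p for x, p in zip(xs, ps))


def variance_divisor_n(sample):
    """Sample variance defined with N, not N - 1, in the denominator."""
    mean = sum(sample) / len(sample)
    return sum((x - mean) ** 2 for x in sample) / len(sample)


def expected_variance_divisor_n(n):
    """Average the divisor-N variance over every possible ordered sample of
    size n, each weighted by its probability of occurrence."""
    total = 0.0
    for picks in product(range(len(values)), repeat=n):
        prob = 1.0
        for i in picks:
            prob *= probs[i]
        total += prob * variance_divisor_n([values[i] for i in picks])
    return total


e_x = expected_value(values, probs)
pop_variance = expected_value([(x - e_x) ** 2 for x in values], probs)
n = 3
print(e_x, pop_variance, expected_variance_divisor_n(n), pop_variance * (n - 1) / n)
```

The last two printed values agree, showing that the expected value of the variance defined with divisor N falls short of the population variance by the factor (N - 1)/N.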


10.3 CONFIDENCE INTERVALS FOR MEANS OF LARGE SAMPLES 


The calculation of confidence intervals for a mean based on a large sample
is a relatively simple procedure. The sampling variance of the sampling
distribution of the mean, as previously stated, is $\sigma_{\bar{x}}^2 = \sigma^2/N$. The
population variance $\sigma^2$ is unknown. If we use an unbiased variance $s^2$ as an
estimate of $\sigma^2$, our estimate of the sampling variance is $s_{\bar{x}}^2 = s^2/N$ and the
estimated standard error is given by


[10.2]    $$s_{\bar{x}} = \frac{s}{\sqrt{N}}$$


This is the commonly used formula for estimating the standard error of the
arithmetic mean.

Consider now the ratio $z = (\bar{X} - \mu)/s_{\bar{x}}$. This ratio is a deviation of a
sample mean from its population mean, divided by an estimate of the
standard deviation of the sampling distribution. It is a standard score.
Assuming the normality of z, it is correct to state that the probability is .95 that the
following statement is true:


[10.3]    $$-1.96 \le \frac{\bar{X} - \mu}{s_{\bar{x}}} \le 1.96$$


This inequality specifies the confidence interval in standard-score form. In
effect it states that the chances are 95 in 100 that $(\bar{X} - \mu)/s_{\bar{x}}$ falls between
$\pm 1.96$. Ordinarily, however, we are not interested in the confidence
interval in standard-score form, but in raw-score form. To convert the
inequality to raw-score form, we multiply each term by $s_{\bar{x}}$ and rearrange to
isolate $\mu$, obtaining


[10.4]    $$\bar{X} - 1.96 s_{\bar{x}} \le \mu \le \bar{X} + 1.96 s_{\bar{x}}$$


This states that the chances are 95 in 100 that $\mu$ falls between $\bar{X} \pm 1.96 s_{\bar{x}}$;
thus the upper limit is 1.96 standard error units above the sample mean,
and the lower limit is 1.96 standard error units below the mean. The figure
1.96 derives, as the reader will recall, from the fact that 95 per cent of the
area of the normal curve falls within the limits $\pm 1.96$ standard deviation
units from the mean. To illustrate the fixing of confidence intervals, let the
mean IQ of a random sample of 100 secondary school children be 114 and
the standard deviation 17. The standard deviation here is the square root of
the unbiased variance estimate. Our estimate of the standard error of the
mean is $s_{\bar{x}} = 17/\sqrt{100} = 1.70$. The 95 per cent confidence interval is then
given by $114 \pm 1.96 \times 1.70$. The upper limit is 117.33 and the lower limit is
110.67. Thus we may assert with 95 per cent confidence that the population
mean falls within these limits. The 99 per cent confidence limits are given
by $\bar{X} \pm 2.58 s_{\bar{x}}$. The figure 2.58 derives from the fact that 99 per cent of the
area of the normal curve falls within the limits $\pm 2.58$ standard deviation
units from the mean. In the above example the 99 per cent
confidence limits are given by $114 \pm 2.58 \times 1.70$. These limits are 109.61 and
118.39.
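
The arithmetic of the IQ example can be laid out as a short calculation. The following is a minimal sketch; the function name is illustrative, and the multipliers 1.96 and 2.58 are simply the normal-curve values quoted in the text.

```python
from math import sqrt


def large_sample_interval(mean, s, n, z):
    """Confidence limits mean -/+ z * s / sqrt(n) for a large sample,
    where s is the square root of the unbiased variance estimate."""
    standard_error = s / sqrt(n)
    return mean - z * standard_error, mean + z * standard_error


lower95, upper95 = large_sample_interval(114, 17, 100, 1.96)
lower99, upper99 = large_sample_interval(114, 17, 100, 2.58)
print(round(lower95, 2), round(upper95, 2))
print(round(lower99, 2), round(upper99, 2))
```

Passing the multiplier as an argument keeps the same function usable for any level of confidence for which a normal-curve value is available.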
What meaning attaches to the statement that we are 95 per cent
confident that the actual population mean falls within certain specific limits? A
particular sample may have a mean $\bar{X} = 26.88$ with 95 per cent confidence
limits 24.92 and 28.84. Another sample of the same size may have a
mean $\bar{X} = 25.68$ with 95 per cent confidence limits 23.72 and 27.64.
Presumably we could draw a large number of samples, obtain a large number
of upper and lower limits, and prepare frequency distributions of these
upper and lower limits. These two distributions would be experimental
sampling distributions for the 95 per cent confidence limits. Without
elaborating the details of this situation, we state that about 95 per cent of
the intervals so obtained would include the population mean and about 5
per cent of the intervals would not include the population mean. Thus the
statement that we are 95 per cent confident implies that we expect about 95
per cent of our assertions to be correct and the remaining 5 per cent to be
incorrect, or that the odds are 19:1 that the confidence interval includes
the population value. The use of a 95 per cent confidence interval is fairly
common. If a greater degree of confidence is desired, a 99 per cent interval
may be used. This interval is, very roughly, 1.3 times as great as the 95 per
cent interval. Thus as we increase our level of confidence, the interval is
increased. Likewise, of course, as we decrease the level of confidence, the
interval is decreased. Any desired level of confidence can be obtained by
varying the size of the confidence interval. As the confidence level is
decreased and approaches zero, the confidence interval approaches zero as
a limit. As the confidence level is increased and approaches 100, the
confidence interval approaches infinity as a limit. In practice, 95 and 99 per
cent confidence intervals are widely used.

Implicit in the above discussion is the assumption that the ratio
$(\bar{X} - \mu)/s_{\bar{x}}$ is normally distributed. This ratio is not normally distributed,
but approaches the normal form as N increases in size. It
is a not uncommon statistical convention to consider a sample of 30 or
more observations as large and a sample of less than 30 as small. This, of
course, is highly arbitrary.
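
The repeated-sampling interpretation of a confidence interval can be examined experimentally. The sketch below is an illustration under assumptions that are not in the text: a normal population with mean 100 and standard deviation 15, samples of 100 cases, 2,000 repetitions, and a fixed random seed so that the run is repeatable.

```python
import random
from math import sqrt


def coverage_rate(mu, sigma, n, trials, z=1.96, seed=0):
    """Draw many samples from a normal population and return the proportion
    of large-sample confidence intervals that include the population mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = sum(sample) / n
        unbiased_var = sum((x - mean) ** 2 for x in sample) / (n - 1)
        standard_error = sqrt(unbiased_var / n)
        if mean - z * standard_error <= mu <= mean + z * standard_error:
            hits += 1
    return hits / trials


rate = coverage_rate(mu=100.0, sigma=15.0, n=100, trials=2000)
print(rate)
```

The proportion printed should lie close to .95, with some departure due to the finite number of repetitions and to the use of the normal multiplier with an estimated standard error.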


10.4 THE DISTRIBUTION OF t


In drawing samples from a normal population with mean $\mu$ and variance
$\sigma^2$, the distribution of the ratio

$$\frac{\bar{X} - \mu}{\sigma_{\bar{x}}}$$

is normal. This ratio is in standard-score form with zero mean and unit
standard deviation. It is a deviation of a sample mean from a population
mean, divided by the standard deviation of the sampling distribution of
means. Where $\sigma^2$ is unknown, we estimate it from the sample data, using in
this instance an unbiased estimate. We obtain thereby an estimate of $\sigma_{\bar{x}}$.
Denote this by $s_{\bar{x}}$. We may now consider the ratio


[10.5]    $$t = \frac{\bar{X} - \mu}{s_{\bar{x}}} = \frac{\bar{X} - \mu}{\sqrt{\dfrac{\sum (X - \bar{X})^2}{N(N - 1)}}}$$


This ratio contains the variable sample values $\bar{X}$ and $s_{\bar{x}}$ in the numerator
and denominator, respectively. This is a t ratio. It departs appreciably from
the normal form for small N. Its theoretical sampling distribution is called
the distribution of t. If samples of, say, 5 or 10 members are drawn from a
normal population, a value of t calculated for each sample, and a frequency
distribution of the different values of t prepared, the resulting distribution
will not be normally distributed. It will be symmetrical but leptokurtic. The
theoretical sampling distribution of t for small N is also symmetrical and
leptokurtic. It tapers off to infinity at the two extremities. It is, however,
thicker at the extremities than the corresponding normal curve. A different
t distribution exists for each number of degrees of freedom. As the number
of degrees of freedom increases, the t distribution approaches the normal
form. Figure 10.1 compares the normal distribution with the distribution of
t for various degrees of freedom.


Fig. 10.1 Distribution of t for various degrees of freedom; vertical axis f(t),
relative frequency. (From D. Lewis, Quantitative methods in psychology,
McGraw-Hill Book Company, New York, 1960.)


Hitherto we have considered two theoretical model frequency
distributions, the binomial distribution and the normal distribution. The t
distribution is a third theoretical model distribution with wide application to many
sampling problems. It was developed originally in 1908 by W. S. Gosset,
who wrote under the pen name "Student."

In sampling problems the t distribution is used in a manner directly
analogous to the normal distribution. In the normal distribution 95 per cent of
the total area under the curve falls within plus and minus 1.96 standard
deviation units from the mean and 5 per cent of the area falls outside these
limits. Likewise, 99 per cent of the area under the normal curve falls within
plus and minus 2.58 standard deviation units from the mean and 1 per cent
of the area falls outside these limits. In the t distribution, the distances
along the base line of the curve that include 95 per cent and 99 per cent of
the total area are different for different numbers of degrees of freedom. It is
customary in tabulating areas under the t curve to use degrees of freedom,
df, instead of N. While the df associated with the sample variance is N - 1,
the df associated with other statistics may be N - 2, N - 3, and the like.
Consequently, tables of t by degrees of freedom instead of N are more
generally applicable. The distances from the mean, measured along the base
line of the t distribution, that include 95 per cent and 99 per cent of the
total area (analogous to the 1.96 and 2.58 of the normal distribution) for
selected degrees of freedom are as follows:


    df     95 per cent     99 per cent

     1        12.71           63.66
     2         4.30            9.93
     3         3.18            5.84


Note that as the number of degrees of freedom approaches infinity, t
approaches the values 1.96 and 2.58. The difference between t for about 30
degrees of freedom and t for an indefinitely large number of degrees of
freedom is sometimes interpreted for practical purposes as trivial. A more
complete tabulation of t is given in Table B of the Appendix. A distinction
is often made between large and small sample statistics. This distinction
resides in the fact that the normal distribution is frequently found to be an
appropriate model for use with sampling problems involving large samples.
With small samples the distribution of t provides for many statistics a more
appropriate model.
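
Tabled values of t of the kind shown above can be approximated numerically. The sketch below is one possible approach and is not the method by which the Appendix table was prepared: it integrates the standard density of the t distribution with Simpson's rule and searches by bisection for the distance that encloses the required central area. The function names, the number of integration steps, and the search bound of 200 are assumptions made for the example.

```python
from math import gamma, pi, sqrt


def t_density(t, df):
    """Ordinate of the t distribution with df degrees of freedom."""
    constant = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return constant * (1 + t * t / df) ** (-(df + 1) / 2)


def central_area(t, df, steps=2000):
    """Area under the t curve between -t and +t, by Simpson's rule
    applied to the interval from 0 to t and doubled (steps must be even)."""
    h = t / steps
    total = t_density(0.0, df) + t_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    return 2 * total * h / 3


def critical_t(df, level, upper=200.0):
    """Distance from the mean enclosing the given central area."""
    lower = 0.0
    for _ in range(40):  # bisection: halve the search interval each pass
        middle = (lower + upper) / 2
        if central_area(middle, df) < level:
            lower = middle
        else:
            upper = middle
    return (lower + upper) / 2


table = {df: (critical_t(df, 0.95), critical_t(df, 0.99)) for df in (1, 2, 3)}
for df, (t95, t99) in table.items():
    print(f"{df:>2}  {t95:7.2f}  {t99:7.2f}")
```

The same function may be called with larger numbers of degrees of freedom to observe the approach of t toward the normal-curve values.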


10.5 DEGREES OF FREEDOM


In the above discussion on the distribution of t, mention is made of the 
number of degrees of freedom. This concept was discussed also in Chap. 4, 
where the sample variance was defined as the sum of squares of deviations 
about the arithmetic mean, divided by the number of degrees of freedom. 
The degree of freedom concept requires further elaboration. 

As stated, and illustrated in Chap. 4, the number of degrees of freedom
is the number of values of the variable that are free to vary. The
measurements 10, 14, 6, 5, and 5, when represented as deviations from a mean of 8,
become $+2$, $+6$, $-2$, $-3$, $-3$. The sum of these deviations is zero. In
consequence, if any four deviations are known, the remaining deviation is
determined. The number of degrees of freedom is 4.
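
The restriction that the deviations sum to zero can be verified directly for the five measurements given above. The following is a minimal sketch; the variable names are illustrative.

```python
scores = [10, 14, 6, 5, 5]
mean = sum(scores) / len(scores)
deviations = [x - mean for x in scores]

# Because the deviations must sum to zero, any four of them fix the fifth.
known_four = deviations[:4]
implied_fifth = -sum(known_four)

print(mean, deviations, implied_fifth)
```

Only four of the five deviations can be assigned freely, which is the sense in which the number of degrees of freedom is 4.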

This type of situation may be represented in symbolic form. Let $X_1$, $X_2$, and
$X_3$ be three measurements with mean $\bar{X}$. The sum of deviations is
$(X_1 - \bar{X}) + (X_2 - \bar{X}) + (X_3 - \bar{X}) = 0$. If $\bar{X}$ and any two of the values of X are
known, the third value of X is determined. The number of degrees of


requires the sum of squares of deviations about the mean, $\sum (X - \bar{X})^2$.
N - 1 of the values of which this sum of squares is comprised


vary independently. The number of degrees of freedom associated with the
sum of squares is N - 1. Dividing this sum of s


2. If there are two points only, a straight 
of squares of deviations about 


à om of variation is possible. With 
three points df — 1; with 15 points, df — 13. Thee 




freedom. A point on a plane has freedom of movement in two dimensions
and has 2 degrees of freedom. A point in a space of three dimensions has 3
degrees of freedom. Likewise, a point in a space of k dimensions has k
degrees of freedom. It has freedom of movement in k dimensions.

The concept of degrees of freedom is widely used in statistical work and
will be discussed subsequently in connection with contingency tables and
the analysis of variance. The essence of the idea is simple. The number of
degrees of freedom is always the number of values that are free to vary,
given the number of restrictions imposed upon the data. It seems
intuitively obvious that in the study of variation we should concern ourselves
with the number of values that enjoy freedom to vary within the restrictions
of the problem situation.


10.6 CONFIDENCE INTERVALS OF MEANS FOR SMALL SAMPLES 


The line of reasoning used in determining confidence intervals for small
samples is similar to that for large samples. With small samples, however,
the distribution of t is used instead of the normal distribution in fixing the
limits of the interval. For large samples the 95 and 99 per cent confidence
intervals for the mean are given, respectively, by $\bar{X} \pm 1.96 s_{\bar{x}}$ and
$\bar{X} \pm 2.58 s_{\bar{x}}$. For small samples an unbiased estimate of $\sigma^2$ is used in estimating
the standard error. The value of t used in fixing the limits of the 95 and 99
per cent intervals will vary, depending on the number of degrees of
freedom. Consider an example where $\bar{X} = 24.26$, $s^2 = 64$, N = 16, and
df = 16 - 1 = 15. On reference to Table B of the Appendix we observe that for
15 degrees of freedom 95 per cent of the area of the distribution falls within
a t of $\pm 2.13$ from the mean. The standard error using the unbiased variance
estimate is $8/\sqrt{16} = 2$. The 95 per cent confidence limits are given by
$24.26 \pm 2.13 \times 8/\sqrt{16}$. These limits are 20.00 and 28.52. We may assert with 95 per
cent confidence that the population mean falls within these limits. The 99
per cent limits are given by $24.26 \pm 2.95 \times 8/\sqrt{16}$. These limits are 18.36
and 30.16.
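
The small-sample calculation differs from the large-sample one only in the multiplier. The sketch below takes the example figures ($\bar{X} = 24.26$, $s^2 = 64$ as the unbiased variance estimate, N = 16) and the tabled t values 2.13 and 2.95 for 15 degrees of freedom; the function name is illustrative. Computed directly from these figures, the standard error is $8/\sqrt{16} = 2$, so the limits come to 20.00 and 28.52 at the 95 per cent level and 18.36 and 30.16 at the 99 per cent level.

```python
from math import sqrt


def small_sample_interval(mean, unbiased_variance, n, t_value):
    """Confidence limits mean -/+ t * sqrt(s^2 / n), where t_value is read
    from a table of t for n - 1 degrees of freedom."""
    standard_error = sqrt(unbiased_variance / n)
    return mean - t_value * standard_error, mean + t_value * standard_error


lower95, upper95 = small_sample_interval(24.26, 64, 16, 2.13)
lower99, upper99 = small_sample_interval(24.26, 64, 16, 2.95)
print(round(lower95, 2), round(upper95, 2))
print(round(lower99, 2), round(upper99, 2))
```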


10.7 STANDARD ERRORS AND CONFIDENCE 
INTERVALS OF PROPORTIONS 


The estimate of the standard error of a proportion is given by


[10.6]    $$s_p = \sqrt{\frac{p(1 - p)}{N}} = \sqrt{\frac{pq}{N}}$$


where 1 - p = q. Also, it may be readily shown that the standard error of a
per cent is given by


[10.7]    $$s_{\%} = 100 \sqrt{\frac{pq}{N}}$$


If it can be assumed that the sampling distribution of a proportion can be
approximately represented by a normal distribution, then the 95 and 99 per
cent confidence limits for a proportion are given by $p \pm 1.96 s_p$ and
$p \pm 2.58 s_p$, respectively. Whether or not the sampling distribution can be
represented by a normal distribution depends both on the size of the
sample and on the value of p. For any given value of N the sampling
distribution of a proportion becomes increasingly skewed as p and q depart
from .50. Quite clearly, the formula for the standard error of a proportion
should not be used with reference to a normal curve for extreme values of p
and q. It has been suggested that the formula for the standard error of a
proportion should be used only when Np or Nq, whichever is the smaller, is
equal to or greater than 5. Thus when p = .10 and N = 20, Np = 2. The use
of the formula $s_p = \sqrt{pq/N}$ would be considered inappropriate here. When
p = .10 and N = 100, Np = 10. Presumably, here the differences between
the binomial and the normal distribution are quite small and can safely be
ignored.
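
The suggested rule and the confidence limits for a proportion can be combined in one short routine. The sketch below is illustrative; the function name and the choice to raise an error when the rule is not met are assumptions made for the example, not a procedure prescribed in the text.

```python
from math import sqrt


def proportion_interval(p, n, z=1.96, minimum=5):
    """Normal-curve confidence limits for a proportion, refused when the
    smaller of Np and Nq is below the suggested minimum of 5."""
    q = 1 - p
    if min(n * p, n * q) < minimum:
        raise ValueError("Np or Nq is less than 5; normal curve not advised")
    standard_error = sqrt(p * q / n)
    return p - z * standard_error, p + z * standard_error


lower, upper = proportion_interval(0.10, 100)   # Np = 10, rule satisfied
print(round(lower, 4), round(upper, 4))

try:
    proportion_interval(0.10, 20)               # Np = 2, rule violated
except ValueError as error:
    print(error)
```

With p = .10 and N = 100 the standard error is .03, so the 95 per cent limits are .10 plus and minus 1.96 times .03.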


10.8 STANDARD ERRORS AND CONFIDENCE
INTERVALS OF OTHER STATISTICS


The standard error of the median may be estimated by


[10.8]    $$s_{mdn} = \frac{1.253\,s}{\sqrt{N}}$$


This method is described by Kenney and
Keeping (1954) and Johnson (1949). Given N observations arranged in


$\le X_N$, the median is the middle value. The


[10.9]    $$s_s = \frac{s}{\sqrt{2N}}$$


The 95 and 99 per cent confidence limits can readily be obtained by taking
$s \pm 1.96 s_s$ and $s \pm 2.58 s_s$, respectively. In using this formula a sample
substantially greater than 30 should be regarded as large. The method of
determining confidence limits for s based on small samples, and indeed the
method which is perhaps most appropriate in all cases regardless of size of
N, involves a knowledge of the distribution of chi square, or $\chi^2$. For a
simple discussion of this method see Freund (1967) or Johnson (1949). The
application of $\chi^2$ to a variety of statistical problems will be discussed in
Chap. 13.


EXERCISES


1 Using a large-sample procedure, obtain the 95 and 99 per cent
confidence intervals for a mean of 105, where N = 100 and s = 10. Obtain
also the 75 and 85 per cent confidence intervals.

2 A random sample of 400 observations has a mean of 50 and a standard
deviation of 18. Estimate the 95 and 99 per cent confidence limits for
the mean.

3 How is the standard error of the mean affected by tripling sample size?

4 Using a large-sample procedure, estimate for the following data the 95
and 99 per cent confidence intervals for means:


         $\bar{X}$      N     $\sum (X - \bar{X})^2$

    a    262     7        77.0
    b    583     1       249.0
    c    463    25     1,525.0
    d     84    16       444.7


5 Find the value of t for df = 20 such that the proportion of the area (a) to
the right of t is .025, (b) to the left of t is .0005, (c) between the mean and
t is .45, (d) between $\pm t$ is .90.

6 Obtain the values required in (a), (b), (c), and (d) of Exercise 5 above for
df = 5.

7 What proportion of the area of the t distribution falls (a) above
t = 3.169 where df = 10, (b) below t = $-1.725$ where df = 20, (c)
between t = $\pm 3.659$ where df = 29, (d) between t = 2.131 and
t = 2.602 where df = 15, (e) between t = $-4.541$ and t = 3.182 where
df = 3?

8 Estimate the 95 and 99 per cent confidence limits for p = .15 where
N = 169.




TESTS OF SIGNIFICANCE: 
MEANS 


INTRODUCTION 


In Chaps. 9 and 10 we considered the sampling error associated with single 


standard errors of single sample 
discussed. In practical statistical 


an experimental group of subjects and a placebo, 
the drug, to a control group and measure the е 


icance. 
Tests of significance may be a 
calculated on independent sam 


coefficient is significantly different from zero. In this case the fixed value is
zero. While many tests of significance involve a comparison of two sample
statistics, or a single sample statistic and a fixed value, such tests can
readily be extended to cover situations where more than two sample
statistics are involved. For example, in the experiment mentioned above on
the effects of a drug on time perception, the experiment could be designed
to include the administration of the drug in different dosages to different
groups of subjects. Three or four or five different dosages might be used,
resulting in three or four or five different means. The means could be
compared two at a time to ascertain whether or not the differences between
them could be attributed to sampling error. A more efficient form of
analysis, the analysis of variance, provides a procedure for making an over-all
test in this type of situation.


THE NULL HYPOTHESIS 
Consider an experiment using an experimental and a control group. A
treatment is applied to the experimental group. The treatment is absent from
the control group. Measurements are made on both groups. Presumably
any significant difference between the two groups can be ascribed with
confidence to the treatment and to no other cause. Let $\bar{X}_1$ and $\bar{X}_2$ be the
means for the experimental and the control group, respectively. Both
means are subject to sampling error. The means $\bar{X}_1$ and $\bar{X}_2$ are estimates of
the population means $\mu_1$ and $\mu_2$. The trial hypothesis may be formulated
that no difference exists between $\mu_1$ and $\mu_2$. This hypothesis is a null
hypothesis and may be written

$$H_0: \mu_1 - \mu_2 = 0$$

The symbol $H_0$ represents the null hypothesis. Very simply, this hypothesis
asserts that no difference exists between the two population means. Note
that the statement $\mu_1 - \mu_2 = 0$ is the same as $\mu_1 = \mu_2$. Thus an alternative
formulation of the hypothesis is to assert that the two samples are drawn
from populations having the same mean. In general, regardless of the particular
statistics used, the null hypothesis is a trial hypothesis asserting
that no difference exists between population parameters. Thus a null
hypothesis about two variances would take the form $H_0: \sigma_1^2 - \sigma_2^2 = 0$, or
$H_0: \sigma_1^2 = \sigma_2^2$.

The logical steps used by an investigator in applying tests of significance
are these. First, he assumes the null hypothesis, proceeding
on the trial hypothesis that the treatment applied will have no effect.
Second, he examines the empirical data. Where the hypothesis involves
two means he examines the difference between the two means, $\bar{X}_1 - \bar{X}_2$.
Third, the question is asked, what is the probability of obtaining a difference
equal to or greater than the one observed in drawing samples at




BASIC STATISTICS 


random from populations where the null hypothesis is assumed to be true? 
In the case of two means, what is the probability of obtaining a difference
equal to or greater than $\bar{X}_1 - \bar{X}_2$ in drawing random samples from populations
where $\mu_1 - \mu_2 = 0$? Fourth, if this probability is small, the observed
result being highly improbable on the basis of the null hypothesis, the 
investigator may be prepared to reject the null hypothesis. This means that 
the observed difference cannot reasonably be explained by sampling error 
and presumably may be attributed to the treatment applied. Thus the 
result may be said to be significant. If this probability cannot be considered 
small and the observed result is not highly improbable, then sampling error 
may account for the difference observed. Hence we cannot with confidence 
infer that the difference results from the treatment applied. 
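
The four steps above can be illustrated by simulation. The sketch below is not part of the original procedure: it assumes, purely for illustration, a normal population with unit standard deviation, two samples of 10 cases, and a hypothetical helper name `null_probability`. It draws many pairs of samples from one population, so that the null hypothesis is true by construction, and counts how often the difference between means is at least as large as an observed difference.

```python
import random
import statistics


def null_probability(observed_diff, n1, n2, sigma=1.0, trials=5000, seed=0):
    """Estimate the chance of a mean difference at least as large as
    observed_diff (in absolute value) when the null hypothesis is true."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Both samples come from the SAME population: mu_1 - mu_2 = 0.
        mean1 = statistics.fmean(rng.gauss(0.0, sigma) for _ in range(n1))
        mean2 = statistics.fmean(rng.gauss(0.0, sigma) for _ in range(n2))
        if abs(mean1 - mean2) >= abs(observed_diff):
            hits += 1
    return hits / trials


# A small observed difference is common under the null hypothesis;
# a large one is rare and would lead to rejection.
p_small_diff = null_probability(0.2, n1=10, n2=10)
p_large_diff = null_probability(1.5, n1=10, n2=10)
print(p_small_diff, p_large_diff)
```

A small estimated probability corresponds to the fourth step: the observed difference is improbable under the null hypothesis, and sampling error is not a reasonable explanation.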

In the testing of any statistical hypothesis, it is necessary to specify an
alternative hypothesis. This alternative is accepted if the initial hypothesis
is rejected. Thus in the testing of the hypothesis $H_0: \mu_1 - \mu_2 = 0$, the
alternative may be $H_1: \mu_1 - \mu_2 \neq 0$. Under certain circumstances, as described
in detail in Sec. 11.5, some advantages may attach to a test of a null
hypothesis against the alternative $H_1: \mu_1 - \mu_2 > 0$, or the alternative
$H_2: \mu_1 - \mu_2 < 0$. The alternative hypothesis under consideration should be clearly
recognized.


TWO TYPES OF ERROR 


In reaching a decision about the null hypothesis $H_0$, two types of error may
arise. An alternative $H_1$ may be accepted when the null hypothesis $H_0$ is
true. This is called a Type I error. The null $H_0$ may be accepted when an
alternative hypothesis $H_1$ is true. This is called a Type II error. The probabilities
of committing Type I and Type II errors are represented, respectively,
by $\alpha$ and $\beta$.


The situation may be represented as follows: 


                  $H_0$ is true              $H_1$ is true

Accept $H_0$      correct decision           Type II error ($\beta$)

Accept $H_1$      Type I error ($\alpha$)    correct decision


When we accept $H_0$, and $H_0$ is true, this is a correct decision about
nature. When we accept $H_1$, and $H_1$ is true, this is also a correct decision
about nature. The acceptance of $H_1$ when $H_0$ is true and the acceptance of
$H_0$ when $H_1$ is true are both errors. Both are incorrect decisions about
nature.

The situation above is somewhat analogous to that which arises in the
acceptance or rejection of applicants for employment or admission to a university.
For example, a university admissions officer may accept applicants
who subsequently are successful in the university, or he may reject
applicants who would have failed if they had been accepted. In either case
a correct decision may be said to have been made. On the other hand, the
admissions officer may reject an applicant who would have been successful
if he had been accepted, or he may accept an applicant who subsequently
fails. In both instances an error has been made.


LEVELS OF SIGNIFICANCE 


The probability of Type I, or $\alpha$, error is called the level of significance of a
test. Ordinarily the investigator adopts, perhaps rather arbitrarily, a particular
level of significance. It is a common convention to adopt levels of significance
of either .05 or .01. If the probability is equal to or less than .05 of
asserting that there is a difference between two means, for example, when
no such difference exists, then the difference is said to be significant at the
.05, or 5 per cent, level or less. Here the chances are 5 in 100, or less, that
the difference could result when the treatment applied is having no effect.
If the probability is .01, or less, the difference is said to be significant at the
.01, or 1 per cent, level. The .05 and .01 probability levels are descriptive of
our degree of confidence that a real difference exists, or that the observed
difference is not due to the caprice of sampling. Usually in evaluating an
experimental result, it is unnecessary to determine the probabilities with a
high degree of accuracy. For most practical purposes it is sufficient to designate
the probability as p < .05, or p < .01, or possibly p < .001 if the
result is highly significant.

Decisions regarding the rejection, or otherwise, of the null hypothesis at
a level of significance $\alpha$ are commonly made without reference to Type II,
or $\beta$, error. A detailed description of the functioning of $\beta$ error is beyond
the scope of this book. A few comments, however, may be appropriate.
Consider, for example, the difference between two means, $\mu_1$ and $\mu_2$. For
any particular value of $\alpha$, say .05, the value of $\beta$ is a function of sample size
N and the actual difference between $\mu_1$ and $\mu_2$. For specified $\alpha$ and N, the
value of $\beta$, which is the probability of failing to recognize the existence of a
difference when such a difference exists, will decrease with increase in the
difference between $\mu_1$ and $\mu_2$. This means that the larger the difference
between $\mu_1$ and $\mu_2$ the less likely we are to accept $H_0$. For a specified difference
between $\mu_1$ and $\mu_2$ and a specified N the value of $\beta$ will increase as
$\alpha$ decreases. Accordingly, if too strict a level of significance is adopted, we
may fail to reject the null hypothesis when in fact a fairly large difference
between $\mu_1$ and $\mu_2$ exists. For any specified difference between $\mu_1$ and $\mu_2$
and any $\alpha$, the Type II error $\beta$ is a function of sample size N. The smaller
the sample the greater the value of $\beta$. This means that although a large difference
may exist between $\mu_1$ and $\mu_2$, it may be difficult to prove for small
samples.
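
The three relations just described can be checked numerically. The following is a rough sketch under simplifying assumptions that the text does not make: a two-tailed test based on the normal curve, a known common standard deviation `sigma`, and equal group sizes `n`. The function name `type_ii_error` is hypothetical.

```python
from statistics import NormalDist


def type_ii_error(diff, sigma, n, alpha=0.05):
    """Approximate beta for a two-tailed normal-curve test of two means
    with n cases per group and a known standard deviation sigma."""
    std = NormalDist()                    # unit normal curve
    se = sigma * (2.0 / n) ** 0.5         # standard error of the difference
    z_crit = std.inv_cdf(1 - alpha / 2)   # e.g. 1.96 when alpha = .05
    shift = diff / se                     # true difference in standard-error units
    # beta is the chance that the deviate falls between -z_crit and +z_crit,
    # so that the null hypothesis is not rejected although a difference exists.
    return std.cdf(z_crit - shift) - std.cdf(-z_crit - shift)


print(type_ii_error(diff=0.5, sigma=1.0, n=20))              # baseline
print(type_ii_error(diff=1.0, sigma=1.0, n=20))              # larger difference, smaller beta
print(type_ii_error(diff=0.5, sigma=1.0, n=20, alpha=0.01))  # stricter alpha, larger beta
print(type_ii_error(diff=0.5, sigma=1.0, n=80))              # larger sample, smaller beta
```

When `diff` is zero the function returns 1 minus alpha, which is the probability of correctly retaining a true null hypothesis.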
Although, quite clearly, failure to reject the null hypothesis does not
imply that the null hypothesis is true, many investigators exhibit an inclination
to conclude, even for quite small samples, that no difference, or a
trivial difference, exists when a required level of significance is not
achieved. Our discussion of Type II error clearly indicates that such
conclusions are unwarranted. It would, of course, be possible to establish
dual criteria such that if $\alpha$ were greater than, say, .05 and $\beta$ were less than,
say, .05, the investigator, in practice, might be allowed to conclude that the
difference was nonexistent or inconsequential.


DIRECTIONAL AND NONDIRECTIONAL TESTS 


An investigator may wish to test the null hypothesis, $H_0: \mu_1 - \mu_2 = 0$,
against the alternative, $H_1: \mu_1 - \mu_2 \neq 0$. This means that if $H_0$ is rejected,
the decision is that a difference exists between the two means. No assertion
is made about the direction of the difference. Such a test is a nondirectional
test. A test of this kind is sometimes called a two-tailed or two-sided
test, because if the normal distribution, or the distribution of t, is used, the
two tails, or the two sides, of the distribution are employed in the estimation
of probabilities. Consider a 5 per cent significance level. If the
sampling distribution is normal, 2.5 per cent of the area of the curve falls to
the right of 1.96 standard deviation units above the mean, and 2.5 per cent
falls to the left of 1.96 standard deviation units below the mean. The area
outside these limits is 5 per cent of the total area under the curve. Under
the null hypothesis the chances are 2.5 in 100 of getting a difference of 1.96
standard deviation units in one direction because of chance factors alone,
and 2.5 in 100 in the other direction. Hence the chances in either direction
are 5 in 100. Thus for significance at the 5 per cent level for a nondirectional
test, when the sampling distribution is normal, the observed difference
must be equal to or greater than 1.96 times the standard deviation
of the sampling distribution of differences. For significance at the 1 per
cent level a value of 2.58 is required for a nondirectional test. A nondirectional
test is appropriate if concern is with the absolute magnitude of the
difference, that is, with the difference regardless of sign.

Under certain circumstances we may wish to make a decision about the
direction of the difference. It has been argued that few instances exist in
scientific work where the traditional nondirectional test is of interest. If
concern is with the direction of the difference we may test the hypothesis
$H_0: \mu_1 - \mu_2 \le 0$ against the alternative $H_1: \mu_1 - \mu_2 > 0$, or the hypothesis
$H_0: \mu_1 - \mu_2 \ge 0$ against the alternative $H_2: \mu_1 - \mu_2 < 0$. Note that the
symbol $H_0$ has been used to denote three different hypotheses: an
hypothesis of no difference, an hypothesis of equal to or less than, and an
hypothesis of equal to or greater than. Conventionally the term null
hypothesis has been restricted to an hypothesis of no difference. It is not
inappropriate, as pointed out by Kaiser (1960), to extend the meaning of the
null hypothesis to include hypotheses of equal to or less than and equal to
or greater than. Such tests are directional one-sided, or one-tailed, tests. If
the normal, or t, distribution is used, one side or one tail only is employed
to estimate the required probabilities. To reject $H_0: \mu_1 - \mu_2 \le 0$ and accept
$H_1: \mu_1 - \mu_2 > 0$, using the normal distribution, a normal deviate $\ge +1.64$ is
required for significance at the .05 level. Likewise to reject $H_0: \mu_1 - \mu_2 \ge 0$
and accept $H_2: \mu_1 - \mu_2 < 0$, the corresponding normal deviate is $\le -1.64$.
The figure 1.64 derives from the fact that for a normal distribution 5 per
cent of the area of the curve falls beyond +1.64 standard deviation units
above the mean, and 5 per cent beyond -1.64 standard deviation units
below the mean. For significance at the .01 level for a directional test,
normal deviates $\ge +2.33$ or $\le -2.33$ are required.
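
The critical normal deviates quoted for nondirectional and directional tests can be recovered from the unit normal curve. This is a minimal sketch using Python's standard library; the helper name `critical_z` is an illustrative assumption, not a term used in the text.

```python
from statistics import NormalDist

UNIT_NORMAL = NormalDist()  # mean 0, standard deviation 1


def critical_z(alpha, directional):
    """Normal deviate required for significance at level alpha.

    A nondirectional (two-tailed) test splits alpha between both tails;
    a directional (one-tailed) test places all of alpha in one tail."""
    tail_area = alpha if directional else alpha / 2
    return UNIT_NORMAL.inv_cdf(1 - tail_area)


for alpha in (0.05, 0.01):
    print(alpha,
          round(critical_z(alpha, directional=False), 2),
          round(critical_z(alpha, directional=True), 2))
```

The directional values are smaller than the nondirectional ones at the same level because the whole rejection region lies in a single tail.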

Kaiser (1960) discusses a directional two-sided test, which in the end
result uses much the same procedure as the one described above. His
procedure requires a decision between three hypotheses, $H_0: \mu_1 - \mu_2 = 0$,
$H_1: \mu_1 - \mu_2 > 0$, and $H_2: \mu_1 - \mu_2 < 0$. The rules here, at the 5 per cent significance
level, are that we decide upon $H_0$ when the normal deviate falls
between -1.64 and +1.64. We decide upon $H_1$ when the normal deviate is
greater than +1.64 and upon $H_2$ when the normal deviate is less than
-1.64. This test amounts in practice to the same thing as making two
directional one-sided tests. Involved in its development are errors of the
third kind. These relate to a decision about a difference in the wrong direction.
When is it appropriate to use a directional as distinct from a nondirectional
test? This question is open to some controversy. Clearly there are
many instances in research where the direction of the differences is of substantial
interest; indeed, it has been argued that there are few, if any,
instances where the direction is not of interest. At any rate it is the opinion
of this writer that directional tests should be used more frequently.


SIGNIFICANCE OF THE DIFFERENCE BETWEEN
TWO MEANS FOR INDEPENDENT SAMPLES

Let $\bar{X}_1$ and $\bar{X}_2$ be two sample means based on $N_1$ and $N_2$ cases, respectively.
We proceed by combining the data for the two samples to obtain the
best unbiased estimate of the population variance. This estimate is
obtained by adding together the two sums of squares of deviations about
the two sample means and dividing this by the total number of degrees of
freedom. This unbiased estimate of the population variance may be written
as

[11.2]
$$s^2 = \frac{\sum^{N_1} (X - \bar{X}_1)^2 + \sum^{N_2} (X - \bar{X}_2)^2}{N_1 + N_2 - 2}$$

The two terms in the numerator are sums of squares of deviations about
the means of the two samples of $N_1$ and $N_2$ cases, respectively. The total
number of degrees of freedom on which $s^2$ is based is $N_1 + N_2 - 2$. We lose
two degrees of freedom because deviations are taken separately about the
means of the two samples. The unbiased variance estimate $s^2$ is used to
obtain an estimate of the standard error of the difference between the two
means. This standard error is given by

[11.3]
$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s^2}{N_1} + \frac{s^2}{N_2}}$$

The difference between means, $\bar{X}_1 - \bar{X}_2$, is then divided by this estimate of
the standard error to obtain the ratio

[11.4]
$$t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s^2/N_1 + s^2/N_2}}$$

This ratio has a distribution of t with $N_1 + N_2 - 2$ degrees of freedom. The
values of t required for significance at the .05 and .01 levels will vary,
depending on the number of degrees of freedom, and may be obtained by
consulting Table B of the Appendix.

The formula given above for $s^2$ is not very convenient from a computational
viewpoint. A more convenient formula is

[11.5]
$$s^2 = \frac{\sum^{N_1} X^2 - \left(\sum^{N_1} X\right)^2 / N_1 + \sum^{N_2} X^2 - \left(\sum^{N_2} X\right)^2 / N_2}{N_1 + N_2 - 2}$$

This, if desired, may be further modified by writing

$$\frac{\left(\sum^{N_1} X\right)^2}{N_1} = N_1 \bar{X}_1^2 \quad \text{and} \quad \frac{\left(\sum^{N_2} X\right)^2}{N_2} = N_2 \bar{X}_2^2$$

Let the following be error scores obtained for two groups of experimental
animals in running a maze under different experimental conditions:

Group A   16   9   4   23   19   10   5   2
Group B   20   5   1   16    2    4

The following statistics are calculated from these data:

               Group A    Group B

N                  8          6
$\sum X$          88         48
$\bar{X}$         11          8
$\sum X^2$     1,372        702

The unbiased estimate of variance is

$$s^2 = \frac{1{,}372 - 88^2/8 + 702 - 48^2/6}{8 + 6 - 2} = 60.17$$


11.7 




The t ratio is then

$$t = \frac{11 - 8}{\sqrt{60.17/8 + 60.17/6}} = .716$$

The number of degrees of freedom in this example is 8 + 6 - 2 = 12. For 12
degrees of freedom a t equal to 2.179 is required for significance at the .05
level. In this example the difference between means is not significant. No
adequate grounds exist for rejecting the null hypothesis. We are not justified
in drawing the inference from these data that the two experimental
conditions are exerting a differential effect on the behavior of the animals.
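
The arithmetic of this example can be retraced with a short script. The sketch below applies the computational formula for the pooled variance and the t ratio to the maze-error scores given above; the function name `pooled_t` is an illustrative assumption. The critical value 2.179 still has to be read from a table of t.

```python
def pooled_t(sample1, sample2):
    """t ratio for two independent samples using a pooled variance estimate."""
    n1, n2 = len(sample1), len(sample2)
    # Sums of squares of deviations about each sample mean,
    # computed as sum(X^2) - (sum X)^2 / N.
    ss1 = sum(x * x for x in sample1) - sum(sample1) ** 2 / n1
    ss2 = sum(x * x for x in sample2) - sum(sample2) ** 2 / n2
    df = n1 + n2 - 2
    s2 = (ss1 + ss2) / df                  # unbiased variance estimate
    se = (s2 / n1 + s2 / n2) ** 0.5        # standard error of the difference
    t = (sum(sample1) / n1 - sum(sample2) / n2) / se
    return s2, t, df


group_a = [16, 9, 4, 23, 19, 10, 5, 2]
group_b = [20, 5, 1, 16, 2, 4]
s2, t, df = pooled_t(group_a, group_b)
print(round(s2, 2), round(t, 3), df)
print(abs(t) >= 2.179)   # compare with the tabled .05 value for 12 df
```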

The t test described here assumes that the distributions of the variables
in the populations from which the samples are drawn are normal. It
assumes also that these populations have equal variances. This latter condition
is referred to as homogeneity of variance. The t test should be used
only when there is reason to believe that the population distributions do not
depart too grossly from the normal form and the population variances do
not differ markedly from equality. Tests of normality and homogeneity of
variance may be applied, but these tests are not very sensitive for small
samples.


SIGNIFICANCE OF THE DIFFERENCE BETWEEN 
TWO MEANS FOR CORRELATED SAMPLES 


Consider a situation where a single group of subjects is studied under two
separate experimental conditions. The data may, for example, be autonomic
response measures under stress and nonstress or measures of motor
performance in the presence or absence of a drug. The data are composed
of pairs of measurements. These may be correlated. This circumstance
leads to a test of significance between means different from that for
independent samples. A procedure for testing significance may be applied
without actually computing the correlation coefficient between the paired
observations. This method is sometimes called the difference method. Its
nature is simply described. Given a set of N paired observations, the difference
between each pair may be obtained. Denote any pair of observations
by $X_1$ and $X_2$ and the difference between any pair $X_1 - X_2$ by D. The
mean difference over all pairs is $(\sum D)/N = \bar{D}$. It is readily observed that
the difference between the means of the two groups of observations is
equal to the mean difference. The difference between any pair of observations
is $X_1 - X_2 = D$. Summing over N pairs yields $\sum X_1 - \sum X_2 = \sum D$.
Dividing by N, we obtain $\bar{X}_1 - \bar{X}_2 = \bar{D}$. Since the mean difference is the difference
between the two means, we may test for the significance of the difference
between means by testing whether or not $\bar{D}$ is significantly different
from zero. Here in effect we treat the D's as a variable and test the
difference between the mean of this variable and zero.
An unbiased estimate of the variance of the D's is given by

[11.6]
$$s_D^2 = \frac{\sum (D - \bar{D})^2}{N - 1}$$

where N is the number of paired observations. Using this unbiased
estimate, the sampling variance of $\bar{D}$ is given by

[11.7]
$$s_{\bar{D}}^2 = \frac{s_D^2}{N}$$

To test whether $\bar{D}$ is significantly different from zero, we divide $\bar{D}$ by its
standard error to obtain

[11.8]
$$t = \frac{\bar{D}}{s_{\bar{D}}}$$

The number of degrees of freedom used in evaluating t is one less than the
number of pairs of observations, or N - 1. The reader should note that the
$\bar{D}$ in the numerator of the above formula is in effect $\bar{D} - 0$, which is of
course $\bar{D}$. This test is concerned with the significance of the difference of $\bar{D}$ from zero.

The above formula for t is not convenient computationally. A more convenient
computational formula is

[11.9]
$$t = \frac{\sum D}{\sqrt{\left[N \sum D^2 - \left(\sum D\right)^2\right] / (N - 1)}}$$


The data below are those obtained for a group of 10 subjects on a
choice-reaction-time experiment under stress and nonstress conditions,
the stress agent being electric shock. The figures are the number of false
reactions over a series of trials.

           Stress   Nonstress
Subject    $X_1$    $X_2$        D     $D^2$

   1         7        5          2       4
   2         9       15         -6      36
   3         4        7         -3       9
   4        15       11          4      16
   5         6        4          2       4
   6         3        7         -4      16
   7         9        8          1       1
   8         5       10         -5      25
   9         6        6          0       0
  10        12       16         -4      16

Sum         76       89        -13     127
Mean      7.60     8.90      -1.30

The problem here is to test whether the
means under the two conditions are significantly different. These means
are 7.60 and 8.90. The difference between them is equal to the mean of the
differences, or -1.30. The sum and sum of squares of D are, respectively,
-13 and 127. Hence

$$t = \frac{-13}{\sqrt{\left[10 \times 127 - (-13)^2\right] / (10 - 1)}} = -1.18$$

We may ignore the negative sign of t and consider only its absolute magnitude.
The number of degrees of freedom associated with this value of t is 9.
For 9 degrees of freedom we require a t of 2.262 for significance at the 5 per
cent level. The observed value of t is well below this, and the difference
between means is not significant. We cannot justifiably argue from these
data that the mean number of false reactions under the two conditions is
different.
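
The difference method lends itself to a few lines of code. The sketch below applies the computational formula for t to the stress and nonstress scores of the 10 subjects; the function name `paired_t` is an illustrative assumption, and the tabled value 2.262 is taken from the text.

```python
def paired_t(x1, x2):
    """t ratio for correlated samples by the difference method."""
    diffs = [a - b for a, b in zip(x1, x2)]   # D = X1 - X2 for each pair
    n = len(diffs)
    sum_d = sum(diffs)
    sum_d2 = sum(d * d for d in diffs)
    # t = sum(D) / sqrt([N * sum(D^2) - (sum D)^2] / (N - 1))
    t = sum_d / ((n * sum_d2 - sum_d ** 2) / (n - 1)) ** 0.5
    return sum_d, sum_d2, t, n - 1


stress = [7, 9, 4, 15, 6, 3, 9, 5, 6, 12]
nonstress = [5, 15, 7, 11, 4, 7, 8, 10, 6, 16]
sum_d, sum_d2, t, df = paired_t(stress, nonstress)
print(sum_d, sum_d2, round(t, 2), df)
print(abs(t) >= 2.262)   # compare with the tabled .05 value for 9 df
```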

The method described above takes into account the correlation between
the paired measurements. This results because the variance of differences
is related to the correlation between the paired measurements by the
formula

[11.10]
$$s_D^2 = s_1^2 + s_2^2 - 2 r_{12} s_1 s_2$$

When $s_1$, $s_2$, and $r_{12}$ have been computed, as will frequently be the case, the
variance of differences $s_D^2$ can be readily obtained from the above formula
and need not be obtained by direct calculation on the differences themselves.
A positive correlation between the paired measurements will
reduce the size of $s_D^2$ and $s_{\bar{D}}$.
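
The relation between the variance of differences and the correlation can be verified on the stress and nonstress scores used in the preceding example. In this sketch the variances, covariance, and correlation are all computed with N - 1 in the denominator, which is an assumption made here so that the two routes agree exactly; the variable names are illustrative.

```python
from statistics import mean, variance

stress = [7, 9, 4, 15, 6, 3, 9, 5, 6, 12]
nonstress = [5, 15, 7, 11, 4, 7, 8, 10, 6, 16]
n = len(stress)

# Unbiased variances and standard deviations of the two sets of scores.
s1_sq, s2_sq = variance(stress), variance(nonstress)
s1, s2 = s1_sq ** 0.5, s2_sq ** 0.5

# Correlation between the paired measurements.
m1, m2 = mean(stress), mean(nonstress)
cov = sum((x - m1) * (y - m2) for x, y in zip(stress, nonstress)) / (n - 1)
r12 = cov / (s1 * s2)

# Variance of differences obtained two ways.
from_formula = s1_sq + s2_sq - 2 * r12 * s1 * s2
direct = variance([x - y for x, y in zip(stress, nonstress)])
print(round(from_formula, 4), round(direct, 4))
```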


SIGNIFICANCE OF THE DIFFERENCE BETWEEN MEANS 
WHERE POPULATION VARIANCES ARE UNEQUAL 
The t test for the significance of the difference between means assumes
equality of the population variances. Where the assumption of equality of
variance is untenable, the ordinary t test should not be applied. Approximate
methods for use where the variances are unequal have been
suggested by Cochran and Cox (1950) and by Welch (1938). The method of
Cochran and Cox makes an adjustment in the value of t required for significance
at the 5 or 1 per cent level, or other critical level as may be required.
The method proposed by Welch makes an adjustment in the number of
degrees of freedom.

To use the Cochran and Cox method we proceed by calculating the standard
error of the differences between the two means, using the formula

[11.11]
$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sum (X - \bar{X}_1)^2}{N_1 (N_1 - 1)} + \frac{\sum (X - \bar{X}_2)^2}{N_2 (N_2 - 1)}} = \sqrt{s_{\bar{x}_1}^2 + s_{\bar{x}_2}^2}$$

The difference between the sample means is then divided by the standard
error of the difference to obtain a value of t.
One sample is based on $N_1$ cases with $N_1 - 1$ degrees of freedom, the other
on $N_2$ cases with $N_2 - 1$ degrees of freedom. Assume that a two-tailed test
at the 5 per cent level is appropriate. Refer to a table of t and obtain the
critical value of t required for significance at the 5 per cent level with
$N_1 - 1$ degrees of freedom. Obtain also the value of t required with $N_2 - 1$
degrees of freedom. Denote these two values of t as $t_1$ and $t_2$. The approximate
value of t required for significance at the 5 per cent level is given by
the formula

[11.12]
$$t_{.05} = \frac{s_{\bar{x}_1}^2 t_1 + s_{\bar{x}_2}^2 t_2}{s_{\bar{x}_1}^2 + s_{\bar{x}_2}^2}$$

The value of t obtained by dividing the difference between means by the
standard error of their difference must be equal to or greater than $t_{.05}$
before significance at the 5 per cent level can be claimed.

Consider the following data:

Sample A                                 Sample B

$N_1 = 13$                               $N_2 = 9$
$\bar{X}_1 = 26.99$                      $\bar{X}_2 = 15.10$
$\sum (X - \bar{X}_1)^2 = 1{,}128$       $\sum (X - \bar{X}_2)^2 = 1{,}269$
$s_{\bar{x}_1}^2 = 7.23$                 $s_{\bar{x}_2}^2 = 17.62$

The standard error of the difference between means is

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{1{,}128}{13(13 - 1)} + \frac{1{,}269}{9(9 - 1)}} = \sqrt{7.23 + 17.62} = 4.98$$

Divide this into the difference between means to obtain

$$t = \frac{26.99 - 15.10}{4.98} = 2.39$$

For 13 - 1 = 12 and 9 - 1 = 8 degrees of freedom, the values of t required
for significance at the 5 per cent level are, respectively, 2.179 and 2.306.
The value required for significance at the 5 per cent level in testing the significance
of the difference between means is then

$$t_{.05} = \frac{7.23 \times 2.179 + 17.62 \times 2.306}{7.23 + 17.62} = 2.27$$

This value 2.27 is less than the obtained value 2.39. Consequently we may
conclude that the difference between means is significant at the 5 per cent
level.
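
The Cochran and Cox calculation can be sketched as follows, using the sums of squares and sample sizes of the example. The function name `cochran_cox` is an illustrative assumption, and the tabled values 2.179 and 2.306 are supplied as inputs rather than computed. Because the script carries full precision, the t ratio comes out near 2.38 rather than the 2.39 obtained in the text from the rounded standard error 4.98.

```python
def cochran_cox(mean1, mean2, ss1, ss2, n1, n2, t1, t2):
    """t ratio and approximate critical t for means with unequal variances.

    ss1 and ss2 are sums of squares of deviations about the sample means;
    t1 and t2 are tabled critical values for n1 - 1 and n2 - 1 df."""
    v1 = ss1 / (n1 * (n1 - 1))       # squared standard error of mean 1
    v2 = ss2 / (n2 * (n2 - 1))       # squared standard error of mean 2
    se = (v1 + v2) ** 0.5            # standard error of the difference
    t = (mean1 - mean2) / se
    t_crit = (v1 * t1 + v2 * t2) / (v1 + v2)   # weighted critical value
    return se, t, t_crit


se, t, t_crit = cochran_cox(26.99, 15.10, 1128, 1269, 13, 9, 2.179, 2.306)
print(round(se, 2), round(t, 2), round(t_crit, 2))
print(t >= t_crit)   # True indicates significance at the 5 per cent level
```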


Another approximate method proposed by Welch (1947) requires the
calculation of a t value as above by dividing the difference between means
by their standard error. We then refer this value to the table of t using the
following formula for the number of degrees of freedom:

[11.13]
$$df = \frac{\left(s_{\bar{x}_1}^2 + s_{\bar{x}_2}^2\right)^2}{\left(s_{\bar{x}_1}^2\right)^2 / (N_1 + 1) + \left(s_{\bar{x}_2}^2\right)^2 / (N_2 + 1)} - 2$$

Applying this formula to the previous data we obtain

$$df = \frac{(7.23 + 17.62)^2}{7.23^2/14 + 17.62^2/10} - 2 = 15.76$$

The value of df will seldom be a whole number. If df is taken as 16, the
value of t required for significance at the 5 per cent level is 2.12. If df is
taken as 15, the value is 2.13. In either case the observed value of t, 2.39,
exceeds the value required for significance at the 5 per cent level and we
may conclude that the difference between means is significant. This result
is in agreement with that obtained using the Cochran and Cox procedure.
The above procedures are approximate. For a more accurate method the
reader is referred to Welch (1947) and Aspin (1949). The latter author has
prepared tables which assist the comparison of means involving two
variances, separately estimated. The problem has also been discussed by
Gronow (1951).
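
The degrees-of-freedom adjustment can be computed directly from the sums of squares and sample sizes of the example. The sketch below follows the form of the formula given in this section, with $N + 1$ in the denominators and 2 subtracted; the function name `welch_df` is an illustrative assumption. Carrying full precision gives about 15.75, against the 15.76 obtained in the text from rounded variances.

```python
def welch_df(ss1, ss2, n1, n2):
    """Adjusted degrees of freedom for the t ratio with unequal variances."""
    v1 = ss1 / (n1 * (n1 - 1))   # squared standard error of mean 1
    v2 = ss2 / (n2 * (n2 - 1))   # squared standard error of mean 2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 + 1) + v2 ** 2 / (n2 + 1)) - 2


df = welch_df(1128, 1269, 13, 9)
print(round(df, 2))
```

Since the result is rarely a whole number, the table of t is entered with the adjacent whole numbers of degrees of freedom, as the text does with 15 and 16.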


SIGNIFICANCE OF THE DIFFERENCE BETWEEN MEANS 
WHEN THE POPULATION DISTRIBUTIONS ARE NOT NORMAL 


The t test for the significance of the difference between means assumes
normality of the distributions of the variables in the populations from
which the samples are drawn. Where the variables are not normally distributed,
what effect will this have on the probabilities, and significance levels,
as estimated from the distribution of t?

Under certain conditions the sampling distribution of means of samples of size N,
where N is large, is closely approximated by the normal distribution. This
result holds regardless of the shape of the distribution in the population
from which the samples are drawn. The closeness of the approximation
improves as N becomes increasingly large. The implication of this is that
for large samples the nonnormality of the populations will not seriously
affect the estimation of probabilities, except perhaps in cases of very
extreme skewness.

A number of investigators have studied the effect of nonnormal populations
on the t test for small samples. The empirical evidence suggests that
even for quite small samples, say, of the order of 5 or 10, reasonably large
departures from normality will not seriously affect the estimation of probabilities
for a two-tailed t test. A one-tailed t test is, however, apparently
more seriously affected by nonnormality.

Where the data show fairly gross departures from normality it is probably
advisable to use nonparametric, or distribution-free, methods. These
methods provide tests which are independent of the shapes of the distributions
in the populations from which the samples are drawn. They deal with
the ordinal or sign properties of the data. A number of such tests are
described in Chaps. 21 and 22 of this book. Nonparametric methods are
being used with increasing frequency in psychological research.


EXERCISES



1 The following are data for two samples of subjects under two experi- 
mental conditions: 


Sample A 2 5 Ж 9 6 Т 
Sample В 4 16 п 9 8 


Test the significance of the difference between means using a nondirectional
test.


2 The following are data for two independent samples: 


                           Sample A    Sample B

$\bar{X}$                     124         120
N                              50          36
$\sum (X - \bar{X})^2$      5,512       5,184

Test whether the mean for sample A is equal to or greater than that for
sample B.


3 The following are paired measurements obtained for a sample of eight 
subjects under two conditions: 


Condition A 8 17 12 19 5 
Condition B 12 31 17 17 8 


directional test, 


4 Calculate t for the following data:

               Sample A    Sample B

$\bar{X}$         20          25
N                 25          10
$\sum X^2$    12,500       7,900


For a sample of 26 paired measurements ED = 52 and ХЮ? = 
Calculate t. 


What advantages attach to matched groups or paired observations in 
experimentation? 


The means for two independent samples of 10 and 17 cases are 9.
and 14.16, respectively. The unbiased variance estimates are 64.02 and
220.30. Compare the methods proposed by Cochran and Cox with those
proposed by Welch to test the significance of the difference between
the two means.


12.1 


12.2 


TESTS OF SIGNIFICANCE: 
OTHER STATISTICS 


INTRODUCTION 


In Chap. 11 problems associated with the application of tests of significance
to arithmetic means were discussed. Not infrequently, tests of
significance for proportions, variances, correlation coefficients, or other
statistics are required. The general rationale underlying the application
of such tests of significance is precisely the same as that for arithmetic
means, although the technical procedures used in estimating the required
probabilities are different. The present chapter discusses procedures for
applying tests of significance to proportions, variances, and correlation
coefficients, for independent and correlated samples.


SIGNIFICANCE OF THE DIFFERENCE 
BETWEEN TWO INDEPENDENT PROPORTIONS 


Questions arise in the interpretation of experimental results which require
a test of significance of the difference between two independent proportions.
The data are comprised of two samples drawn independently. Of the
$N_1$ members in the first sample, $f_1$ have the attribute A. Of the $N_2$ members
in the second sample, $f_2$ have the attribute A. The proportions having the
attribute in the two samples are $f_1/N_1 = p_1$ and $f_2/N_2 = p_2$. Can the two
samples be regarded as random samples drawn from the same population?
Is $p_1$ significantly different from $p_2$? To illustrate, in a public opinion poll
the proportion .65 in a sample of urban residents may express a favorable
attitude toward a particular issue as against a proportion .55 in a sample of
rural residents. May the difference between the proportions be interpreted
as indicative of an actual urban-rural difference in opinion? To illustrate
further, the proportion of failures in air-crew training in two training periods
may be .42 and .50. Does this represent a significant change in the proportion
of failures, or may the difference be attributed to sampling considerations?

[12.1]

[12.2]

The standard error of a single proportion is estimated by the formula

$$s_p = \sqrt{\frac{pq}{N}}$$

where p = sample value of a proportion
      q = 1 - p

The standard error of the difference between two proportions based on
independent samples is estimated by

$$s_{p_1 - p_2} = \sqrt{pq\left(\frac{1}{N_1} + \frac{1}{N_2}\right)}$$

where p is an estimate based on the two samples combined. The value
p is obtained by adding together the frequencies of occurrence of the
attribute in the two samples and then dividing this by the total number
in the two samples. Thus

$$p = \frac{f_1 + f_2}{N_1 + N_2}$$

where $f_1$ and $f_2$ are the two frequencies.

The justification for combining data from the two samples to obtain
a single estimate of p resides in the fact that in all cases where the difference
between two proportions is tested, the null hypothesis is assumed.
This hypothesis states that no difference exists in the population proportions.
Because the null hypothesis is assumed, we may use an estimate
of p based on the data combined for the two samples. This procedure is
analogous to that used in the t test for the difference between means for
independent samples where the sums of squares for the two samples are
combined to obtain a single variance estimate.

To test the difference between two proportions we divide the observed
difference between the proportions by the estimate of the standard error
of the difference to obtain

$$z = \frac{p_1 - p_2}{s_{p_1 - p_2}} = \frac{p_1 - p_2}{\sqrt{pq\left[(1/N_1) + (1/N_2)\right]}}$$

The value z may be interpreted as a deviate of the unit normal curve
provided $N_1$ and $N_2$ are reasonably large and p is neither very small nor
very large. As usual for a two-tailed test, values of 1.96 and 2.58 are required
for significance at the 5 and 1 per cent levels.

How large should the N's be and how far should p depart from extreme
values before this ratio can be interpreted as a deviate of the unit normal
curve? An arbitrary rule may be used here. If the smaller value of p or q
multiplied by the smaller value of N exceeds 5, then the ratio may be
interpreted with reference to the normal curve. Thus if p = .60, q = .40,
$N_1 = 20$, and $N_2 = 30$, the product .40 × 20 = 8 and the normal curve may be
used.


12.3 




To illustrate, we refer to data obtained in a study of the attitudes of
Canadians to immigrants and immigration policy. Independent samples of
French- and English-speaking Canadians were used. Subjects were asked
whether they agreed or disagreed with present government immigration
practices. In the French-speaking sample of 300 subjects, 176 subjects
indicated agreement. The proportion $p_1$ is 176/300 = .587. In the
English-speaking sample of 500 subjects, 384 indicated agreement. The
proportion $p_2$ is 384/500 = .768. By combining data for the two samples we
obtain a value

$$p = \frac{176 + 384}{300 + 500} = .700$$

The value of $q$ is $1 - .700 = .300$. The estimate of the standard error of the
difference is $s_{p_1 - p_2} = \sqrt{.700 \times .300\,(\tfrac{1}{300} + \tfrac{1}{500})} = .033$.
The required z value is $z = (.768 - .587)/.033 = 5.48$.

Interpreting the value 5.48 as a unit-normal-curve deviate we observe 
immediately that the difference is highly significant. The chances are 
one in a great many millions that the observed difference could result from 
sampling. We may very safely conclude from these data that a real 
difference exists between French- and English-speaking Canadians on the 
question asked. 
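
The arithmetic of this example can be checked with a short sketch. The function name `two_proportion_z` and its argument names are illustrative choices, not part of the text. Because the sketch carries the standard error unrounded, it gives a z of about 5.42 rather than the 5.48 obtained above from the rounded value .033; the conclusion is unchanged.

```python
import math

def two_proportion_z(f1, n1, f2, n2):
    """z ratio for the difference between two independent proportions.

    f1, f2 are the frequencies of agreement; n1, n2 are the sample sizes.
    """
    p1, p2 = f1 / n1, f2 / n2
    p = (f1 + f2) / (n1 + n2)      # pooled estimate, justified by the null hypothesis
    q = 1.0 - p
    se = math.sqrt(p * q * (1.0 / n1 + 1.0 / n2))
    return (p1 - p2) / se, se

# French-speaking sample: 176 of 300; English-speaking sample: 384 of 500
z, se = two_proportion_z(176, 300, 384, 500)
print(f"standard error = {se:.4f}, |z| = {abs(z):.2f}")
```

The sign of z depends only on which sample is listed first; for a two-tailed test the absolute value is compared with 1.96 or 2.58.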

An alternative, but closely related, method exists for testing the signifi- 
cance of the difference between proportions for independent samples. 
This method uses chi square and is described in Chap. 13. 


12.3 SIGNIFICANCE OF THE DIFFERENCE
BETWEEN TWO CORRELATED PROPORTIONS


Frequently in psychological [...] a psychological test may be
administered to a sample of N individuals. The proportions passing
items 1 and 2 are $p_1$ and $p_2$. Paired observations are available for each
individual [...]. A test of the difference between $p_1$ and $p_2$
requires that the correlation between responses be taken into account.

We proceed by tabulating the data in the form of a fourfold, or 2 x 2,
table. A table with four cell frequencies is obtained. By way of illustration
assume that the data are "pass" and "fail" on two test items. The data may
be represented schematically as follows:

    Frequencies
                       Item 2
                    Fail     Pass
    Item 1  Pass     A        B       A + B
            Fail     C        D       C + D
                   A + C    B + D       N

    Proportions
                       Item 2
                    Fail     Pass
    Item 1  Pass     a        b       p_1
            Fail     c        d       q_1
                    q_2      p_2      1.00

The capital letters represent frequencies. The small letters are
proportions obtained by dividing the frequencies by N. The proportions
passing the two items are $p_1$ and $p_2$. We wish to test the significance of
the difference between $p_1$ and $p_2$.
An estimate of the standard error of the difference between two
correlated proportions is given by the formula

[12.3]  $$s_{p_1 - p_2} = \sqrt{\frac{a + d}{N}}$$

This formula is due to McNemar (1947). It takes into account the
correlation between the paired observations. A normal deviate z is obtained by
dividing the difference between the two proportions by the standard error
of the difference. Thus

[12.4]  $$z = \frac{p_1 - p_2}{\sqrt{(a + d)/N}}$$

When the sum of the two cell frequencies, A + D, is reasonably large,
this ratio can be interpreted as a unit-normal-curve deviate, values of
1.96 and 2.58 being required for significance at the 5 and 1 per cent levels
for a two-tailed test. In this context a reasonably large value of A + D
may be taken as about 20 or above.

It may be shown that the formula for the value of z given above reduces
to

$$z = \frac{A - D}{\sqrt{A + D}}$$

where A and D are the cell frequencies. For computational purposes this
is the more useful formulation.

To illustrate, consider the following fictitious data relating to attitude 
change. Let us assume an initial testing followed by a program intended 
to produce a change in attitude, and then a second testing with the same 
attitude scale. On a particular question let the data for the two testings 
be as follows: 


[Fourfold tables of frequencies and proportions, first testing (1st) by
second testing (2d), Disagree/Agree; the cell entries are not legible in
the source.]

Inspection of the above tables indicates a high correlation in response
between the first and second testings. We wish to test the significance
of the difference between .40 and .30. The standard error of the difference
between the two proportions is

$$s_{p_1 - p_2} = \sqrt{\frac{a + d}{N}} = .0316$$

The value of z is

$$z = \frac{.40 - .30}{.0316} = 3.16$$

In this case the difference is significant. It exceeds the value of 2.58
required for significance at the 1 per cent level for a two-tailed test.
Arguments may be advanced for the use of a one-tailed test with the above data.
It may be assumed that knowledge of a program intended to induce attitude
change may warrant a hypothesis about the direction of the change.
In either case the result is significant.
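
The equivalence of the proportion form and the frequency form of this z ratio can be illustrated with a brief sketch. The cell values used here (discordant frequencies of 30 and 10 in a sample of 200) are hypothetical: they are chosen only because they reproduce the reported standard error of .0316 and the difference of .10, and are not taken from the tables of the example. The function name is likewise an illustrative assumption.

```python
import math

def correlated_proportions_z(A, D, N):
    """z ratio for two correlated proportions from the two discordant cells.

    A and D are the frequencies of the two cells in which the paired
    responses disagree; N is the number of pairs.
    """
    a, d = A / N, D / N
    se = math.sqrt((a + d) / N)                   # standard error of the difference
    z_from_proportions = (a - d) / se             # difference divided by its standard error
    z_from_frequencies = (A - D) / math.sqrt(A + D)
    return se, z_from_proportions, z_from_frequencies

# Hypothetical discordant frequencies, consistent with the reported .0316 and 3.16
se, z_prop, z_freq = correlated_proportions_z(A=30, D=10, N=200)
print(f"se = {se:.4f}, z = {z_prop:.2f}, z from frequencies = {z_freq:.2f}")
```

Both forms give the same value because numerator and denominator of the proportion form are each the frequency form divided by N.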


12.4 SIGNIFICANCE OF THE DIFFERENCE BETWEEN
VARIANCES FOR INDEPENDENT SAMPLES


Occasions arise where a test of the significance of the difference between
the variances of measurements for two independent samples is required.
In the conduct of a simple experiment using control and experimental
groups, the effect of the experimental condition may reflect itself not only
in a mean difference between the two groups but also in a variance
difference. For example, in an experiment designed to study the effect of a
distracting agent, such as noise, on motor performance the effect of the
distraction may be to greatly increase the variability of performance, in
addition possibly to exerting some effect upon the mean. The variances
obtained in any experiment should always be the object of scrutiny and
comparison. A common situation, where a test of the significance of the
difference between variances is required, is in relation to the t test for the
significance of the difference between two means. This test assumes the
equality of variances in the populations from which the samples are drawn;
that is, it assumes that $\sigma_1^2 = \sigma_2^2 = \sigma^2$. This condition is usually
spoken of as homogeneity of variance.

Let $s_1^2$ and $s_2^2$ be two variances based on independent samples. We
may consider the difference $s_1^2 - s_2^2$. An alternate procedure is to consider
the ratio $s_1^2/s_2^2$ or $s_2^2/s_1^2$. If the two variances are equal, this ratio
will be unity. If they differ and $s_1^2 > s_2^2$, then $s_1^2/s_2^2 > 1$ and
$s_2^2/s_1^2 < 1$. A departure of the variance ratio from unity is indicative of a
difference between variances, the greater the departure the greater the
difference. Quite clearly, a test of the significance of the departure of the
ratio of two variances from unity will serve as a test of the significance of
the difference between the two variances.

To apply such a test the sampling distribution of the ratio of two
variances is required. To conceptualize such a sampling distribution, consider
two normal populations A and B with the same variance $\sigma^2$. Draw samples
of $N_1$ cases from A and $N_2$ cases from B, calculate unbiased variance
estimates $s_1^2$ and $s_2^2$, and compute the ratio $s_1^2/s_2^2$. Continue this
procedure until a large number of variance ratios is obtained. Always place the
variance of the sample drawn from A in the numerator and the variance
of the sample drawn from B in the denominator. Some of the variance ratios
will be greater than unity; others will be less than unity. The frequency
distribution of the variance ratios for a large number of pairs of variances
is an experimental sampling distribution. The corresponding theoretical
sampling distribution of variance ratios is known as the distribution of F.
The variance ratio is known as an F ratio; that is, $F = s_1^2/s_2^2$, or $F = s_2^2/s_1^2$.
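
The experimental sampling distribution described above can be imitated by simulation. The following is only a sketch: the sample sizes (10 and 10), the number of trials, the unit variance, and the random seed are arbitrary assumptions made for illustration.

```python
import random
import statistics

def simulate_variance_ratios(n_a, n_b, trials, sigma=1.0, seed=0):
    """Draw paired samples from two normal populations with a common variance
    and return the ratio of unbiased variance estimates for each pair."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(trials):
        sample_a = [rng.gauss(0.0, sigma) for _ in range(n_a)]
        sample_b = [rng.gauss(0.0, sigma) for _ in range(n_b)]
        # statistics.variance divides by N - 1, giving the unbiased estimate;
        # the sample from A is always placed in the numerator
        ratios.append(statistics.variance(sample_a) / statistics.variance(sample_b))
    return ratios

ratios = simulate_variance_ratios(n_a=10, n_b=10, trials=5000)
share_above_unity = sum(r > 1.0 for r in ratios) / len(ratios)
print(f"{share_above_unity:.2f} of the simulated ratios exceed unity")
```

With equal sample sizes roughly half of the ratios fall above unity and half below, as the description of the sampling distribution leads one to expect.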

In the above illustration samples of $N_1$ are drawn from one population
and samples of $N_2$ from another. $N_1 - 1$ and $N_2 - 1$ degrees of freedom are
associated with the two variance estimates. A separate sampling
distribution of F exists for every combination of degrees of freedom. Table D of
the Appendix shows values of F required for significance at the 5 and 1 per
cent levels for varying combinations of degrees of freedom. This table
shows values of F equal to or greater than unity. It does not show values of
F less than unity. The number of degrees of freedom associated with the
variance estimates in the numerator and denominator are shown along the
top and to the left, respectively, of Table D. The numbers in lightface type
are the values for significance at the 5 per cent level, and those in boldface
type the values at the 1 per cent level. These values cut off 5 and 1 per cent
of one tail of the distribution of F.

In testing the significance of the difference between two variances, the
null hypothesis $H_0: \sigma_1^2 = \sigma_2^2 = \sigma^2$ is assumed. We then find the ratio
of the two unbiased variance estimates. These are

$$s_1^2 = \frac{\sum (X_1 - \bar{X}_1)^2}{N_1 - 1}$$

and

$$s_2^2 = \frac{\sum (X_2 - \bar{X}_2)^2}{N_2 - 1}$$

No prior grounds exist for deciding which variance estimate should be 
placed in the numerator and which in the denominator of the F ratio. In 
practice the larger of the two variance estimates is always placed in the 
numerator and the smaller in the denominator. In consequence the F 
ratio in this situation is always greater than unity. The F ratio is calculated, 
referred to Table D of the Appendix, and a significance level determined. 
At this point a slight complication arises. The obtained significance level 
must be doubled. Table D shows values required for significance at the 
5 and 1 per cent levels. In comparing the variances for two independent 
groups these become the 10 and 2 per cent levels. The reason for this 
complication resides in the fact that the larger of the two variances has 
been placed in the numerator of the F ratio. This means that we have 
considered one tail only of the F distribution. Not only must we consider 
the probability of obtaining $s_1^2/s_2^2$ but also the probability of $s_2^2/s_1^2$. Where
interest is in the significance of the difference, regardless of direction, the 
required per cent or probability levels are simply obtained by doubling 
those shown in Table D. 

Table D has been prepared for use with the analysis of variance (Chap. 
15) which makes extensive use of the F ratio. In the analysis of variance 
the decision as to which variance estimate should be put in the numerator 
and which in the denominator is made on grounds other than their relative 
size. Consequently, in the analysis of variance, F ratios less than unity 
can occur and Table D provides the appropriate probabilities without any 
doubling procedure. 

To illustrate, a psychological test is administered to a sample of 31 boys
and 26 girls. The sum of squares of deviations, $\sum (X - \bar{X})^2$, is 1,926 for
boys and 2,875 for girls. Unbiased variance estimates are obtained by dividing
the sum of squares by the number of degrees of freedom. The df for boys is
$31 - 1 = 30$ and for girls $26 - 1 = 25$. The variance estimate for boys is
1,926/30 = 64.20 and for girls 2,875/25 = 115.00.

Are boys significantly different from girls in the variability of their
performance on this test? The F ratio is 115.00/64.20 = 1.79. The df for the
numerator is 25 and for the denominator 30. Referring this F to Table D we
see that a value of F of about 1.88 is required for significance at the 5 per
cent level, and doubling this we obtain the 10 per cent level. It is clear,
therefore, that the difference between the variances for boys and girls
cannot be considered statistically significant. The evidence is insufficient
to warrant rejection of the null hypothesis.
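
The computation for the boys and girls example may be sketched as follows. The function name is an illustrative assumption, and the critical value 1.88 is simply the tabled figure quoted in the example rather than something the sketch computes.

```python
def variance_ratio(ss_a, n_a, ss_b, n_b):
    """F ratio with the larger unbiased variance estimate in the numerator.

    ss_a, ss_b are sums of squared deviations; n_a, n_b are sample sizes.
    Returns (F, df for numerator, df for denominator).
    """
    var_a = ss_a / (n_a - 1)
    var_b = ss_b / (n_b - 1)
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1

# Boys: sum of squares 1,926, N = 31; girls: sum of squares 2,875, N = 26
f_ratio, df_num, df_den = variance_ratio(1926, 31, 2875, 26)

critical_f = 1.88   # tabled value quoted in the text; a 10 per cent level after doubling
print(f"F = {f_ratio:.2f} with df = {df_num}, {df_den}; "
      f"exceeds tabled value: {f_ratio >= critical_f}")
```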


12.5 SIGNIFICANCE OF THE DIFFERENCE
BETWEEN CORRELATED VARIANCES

Given a set of paired observations, the two variances are not independent
estimates. Data of this kind arise when the same subjects are tested under
two experimental conditions, or matched samples are used. For example,
in an experiment designed to study the effects of an educational program
on attitude change, attitudes may be measured, an educational program
applied, and attitudes remeasured. It may be hypothesized that some
change in variance of attitude-test scores may result. An increase in
variance may mean that the effect of the program is to reinforce existing
attitudes, producing more extreme attitudes among individuals at both
ends of the attitude continuum. A decrease in variance may mean that the
effect of the program is to produce an attitudinal regression to greater
uniformity.

If $s_1^2$ and $s_2^2$ are the two unbiased variance estimates and $r_{12}$ is the
correlation between the paired observations, the quantity

[12.6]  $$t = \frac{(s_1^2 - s_2^2)\sqrt{N - 2}}{\sqrt{4 s_1^2 s_2^2 (1 - r_{12}^2)}}$$

has a t distribution with $N - 2$ degrees of freedom.

By way of illustration let $s_1^2$ and $s_2^2$ be unbiased variance estimates of
attitude scale scores before and after the administration of an educational
program. Let $s_1^2 = 153.20$ and $s_2^2 = 102.51$ where $N = 38$. The correlation
between the before-and-after attitude measures is .60. Are the two
variances significantly different from each other? We obtain

$$t = \frac{(153.20 - 102.51)\sqrt{38 - 2}}{\sqrt{4 \times 153.20 \times 102.51\,(1 - .36)}} = 1.52$$

The number of degrees of freedom is $38 - 2 = 36$. For significance at the
5 per cent level, a value of t equal to or greater than about 2.03 is required.
The evidence is insufficient to warrant rejection of the null hypothesis. We
cannot argue that the intervening educational program has changed the
variability of attitudes.
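
A minimal sketch of this computation follows. The function and argument names are illustrative assumptions; the data are those of the attitude example.

```python
import math

def correlated_variances_t(var1, var2, r12, n):
    """t ratio for the difference between two correlated variances.

    var1, var2 are unbiased variance estimates from the same n subjects,
    r12 is the correlation between the paired observations.
    Returns (t, degrees of freedom).
    """
    numerator = (var1 - var2) * math.sqrt(n - 2)
    denominator = math.sqrt(4.0 * var1 * var2 * (1.0 - r12 ** 2))
    return numerator / denominator, n - 2

t, df = correlated_variances_t(var1=153.20, var2=102.51, r12=0.60, n=38)
print(f"t = {t:.2f} with {df} degrees of freedom")
```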


12.6 SAMPLING DISTRIBUTION OF THE
CORRELATION COEFFICIENT

We may draw a large number of samples from a population, calculate a
correlation coefficient for each sample, and prepare a frequency
distribution of correlation coefficients. Such a frequency distribution is an
experimental sampling distribution of the correlation coefficient. To illustrate,
casual observation suggests that a positive correlation exists between
height and weight. A number of samples of 25 cases may be drawn at
random from a population of adult males, and a correlation coefficient
between height and weight computed for each sample. These coefficients
will display variation one from another. By arranging them in the form of a
frequency distribution an experimental sampling distribution of the
correlation coefficient is obtained. The mean of this distribution will tend to
approach the population value of the correlation coefficient with increase
in the number of samples. Its standard deviation will describe the
variability of the coefficients from sample to sample. A further illustration may
prove helpful. By throwing a pair of dice a number of times, say, a white
one and a red one, a set of paired observations is obtained. A correlation
coefficient may be calculated for the paired observations. Since the two
dice are independent, the expected value of this correlation coefficient is
zero. However, for any particular sample of N throws, a positive or a
negative correlation may result. A large number of samples of N throws may be
obtained, a correlation coefficient computed for each sample, and a
frequency distribution of the coefficients prepared. The mean of this
experimental sampling distribution will tend to approach zero, the population
value of the correlation coefficient, and its standard deviation will be
descriptive of the variability of the correlation in drawing samples of size
N from this particular kind of population. Note that here, as in all sampling
problems, a distinction is drawn between a population value and an
estimate of that value based on a sample. The symbol $\rho$ is used to refer to the
population value of the correlation coefficient, and $r$ is the sample value.
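
The dice illustration lends itself to a small simulation. What follows is a sketch under arbitrary assumptions: 30 throws per sample, 2,000 samples, and a fixed random seed; none of these figures comes from the text.

```python
import math
import random

def pearson_r(xs, ys):
    """Product-moment correlation between two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def dice_correlations(n_throws, n_samples, seed=0):
    """Correlation between a white and a red die for each of n_samples samples."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_samples):
        white = [rng.randint(1, 6) for _ in range(n_throws)]
        red = [rng.randint(1, 6) for _ in range(n_throws)]
        results.append(pearson_r(white, red))
    return results

rs = dice_correlations(n_throws=30, n_samples=2000)
mean_r = sum(rs) / len(rs)
sd_r = math.sqrt(sum((r - mean_r) ** 2 for r in rs) / len(rs))
print(f"mean r = {mean_r:.3f}, standard deviation of r = {sd_r:.3f}")
```

The mean of the simulated coefficients lies close to zero, while their standard deviation shows how far a single sample value of r may stray from the population value.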

The shape of the sampling distribution of the correlation coefficient
depends on the population value $\rho$. As $\rho$ departs from zero, the sampling
distribution becomes increasingly skewed. When $\rho$ is high positive, say,
$\rho = .80$, the sampling distribution has extreme negative skewness.
Similarly, when $\rho$ is high negative, say, $\rho = -.80$, the distribution has extreme
positive skewness. When $\rho = 0$, the sampling distribution is symmetrical
and for large values of N, say, 30 or above, is approximately normal. The
reason for the increase in skewness in the sampling distribution as $\rho$
departs from zero is intuitively plausible. In sampling, for example, from a
population where $\rho = .90$, values greater than 1.00 cannot occur, whereas
values extending from .90 to -1.00 are theoretically possible. The range of
possible variation below .90 is far greater than the range above .90. This
suggests that the sample values may exhibit greater variability below than
above .90, a circumstance which leads to negative skewness.


The standard deviation of the theoretical sampling distribution of $r$, the
standard error, is given by the formula

[12.7]  $$\sigma_r = \frac{1 - \rho^2}{\sqrt{N - 1}}$$

When $\rho$ departs appreciably from zero, this formula is of little use, because
the departures of the sampling distribution from normality make
interpretation difficult.


Difficulties resulting from the skewness of the sampling distribution
of the correlation coefficient are resolved by a method developed by R. A.
Fisher. Values of $r$ are converted to values of $z_r$ using the transformation

[12.8]  $$z_r = \tfrac{1}{2}\log_e (1 + r) - \tfrac{1}{2}\log_e (1 - r)$$

Values of $z_r$ corresponding to particular values of $r$ need not be computed
directly from the above formula, but may be simply obtained from Table
E in the Appendix. For $r = .50$ the corresponding $z_r = .549$, for $r = .90$,
$z_r = 1.472$, and so on. For negative values of $r$ the corresponding $z_r$ values
may be given a negative sign. In a number of sampling problems involving
correlation, $r$'s are converted to $z_r$'s, and a test of significance is applied to
the $z_r$'s instead of to the original $r$'s.

One advantage of this transformation resides in the fact that the
sampling distribution of $z_r$ is for all practical purposes independent of $\rho$. The
distribution has the same variability for a given N regardless of the size of
$\rho$. Another advantage is that the sampling distribution of $z_r$ is
approximately normal. Values of $z_r$ can be interpreted in relation to the normal
curve. The standard error of $z_r$ is given by

[12.9]  $$s_{z_r} = \frac{1}{\sqrt{N - 3}}$$

The standard error is seen to depend entirely on the sample size.
The $z_r$ transformation may be used to obtain confidence limits for $r$.
Let $r = .82$ for $N = 147$. The corresponding $z_r = 1.157$. The standard error
of $z_r$, given by $1/\sqrt{N - 3}$, is .083. The 95 per cent confidence limits are
obtained by taking 1.96 times the standard error above and below the
observed value of $z_r$, or $z_r \pm 1.96 s_{z_r}$. These are $1.157 + 1.96 \times .083 = 1.320$
and $1.157 - 1.96 \times .083 = .994$. These two $z_r$'s may now be converted back
to $r$'s, where $z_r = 1.320$, $r = .867$ and where $z_r = .994$, $r = .759$. Thus we may
assert with 95 per cent confidence that the population value of the
correlation coefficient falls within the limits .759 and .867. In practice we are
infrequently concerned with fixing confidence intervals for correlation
coefficients. The most frequently occurring problems are testing the
significance of a correlation coefficient from zero and testing the significance
of the difference between two correlation coefficients.
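
The transformation and the confidence limits can be computed directly instead of being read from a table. This is a sketch only: the function names are illustrative, and the conversion back from $z_r$ to $r$ uses the hyperbolic tangent, which is the algebraic inverse of the transformation and is an addition not stated in the text.

```python
import math

def fisher_z(r):
    """Fisher's z_r = 1/2 log_e(1 + r) - 1/2 log_e(1 - r)."""
    return 0.5 * math.log(1.0 + r) - 0.5 * math.log(1.0 - r)

def confidence_limits_for_r(r, n, z_crit=1.96):
    """Confidence limits for a correlation, obtained on the z_r scale
    and converted back to r with tanh (the inverse transformation)."""
    z_r = fisher_z(r)
    se = 1.0 / math.sqrt(n - 3)
    lower_z, upper_z = z_r - z_crit * se, z_r + z_crit * se
    return math.tanh(lower_z), math.tanh(upper_z)

lower, upper = confidence_limits_for_r(r=0.82, n=147)
print(f"z_r = {fisher_z(0.82):.3f}; 95 per cent limits for r: {lower:.3f} to {upper:.3f}")
```

The same `fisher_z` function reproduces the tabled values quoted earlier for r = .50 and r = .90.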


12.7 SIGNIFICANCE OF A
CORRELATION COEFFICIENT

Testing the significance of the correlation between a set of paired
observations is a frequent problem in psychological research. We begin by
assuming the null hypothesis that the value of the correlation coefficient is
equal to zero, or $H_0: \rho = 0$. A test of significance may then be applied
using the distribution of t. The t value required is given by the formula

[12.10]  $$t = r\sqrt{\frac{N - 2}{1 - r^2}}$$

The number of degrees of freedom associated with this value of t is $N - 2$.
The loss of 2 degrees of freedom results because testing the significance of
$r$ from zero is equivalent to testing the significance of the slope of a
regression line from zero. The reader will recall that the correlation coefficient
is the slope of a regression line in standard-score form. The number of
degrees of freedom associated with the variability about a straight line
fitted to a set of points is two less than the number of observations. A
straight line will always fit two points exactly, and no freedom to vary is
possible. With three points there is 1 degree of freedom, with four points
2 degrees of freedom, and so on.

Consider an example where $r = .50$ and $N = 20$. We obtain

$$t = .50\sqrt{\frac{20 - 2}{1 - .50^2}} = 2.45$$

The df = $20 - 2 = 18$. Referring to the table of t, Table B in the Appendix,
we find that for this df a t of 2.10 is required for significance at the 5 per
cent level and a t of 2.88 at the 1 per cent level. The obtained value of t falls
between these two values. The correlation may be said to be significant at
the 5 per cent level.
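
Formula [12.10] is easily evaluated for any r and N. The following sketch uses an illustrative function name and the figures of the example.

```python
import math

def t_for_correlation(r, n):
    """t ratio for testing a correlation coefficient against zero.

    Returns (t, degrees of freedom), with df = n - 2.
    """
    return r * math.sqrt((n - 2) / (1.0 - r ** 2)), n - 2

t, df = t_for_correlation(r=0.50, n=20)
print(f"t = {t:.2f} with {df} degrees of freedom")
```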

Table F of the Appendix presents a tabulation of the values of $r$
required for significance at different levels. We note that where the number
of degrees of freedom is small, a large value of $r$ is required for significance.
For example, where df = 5, a value of $r \geq .754$ is required before we can
argue at the 5 per cent level that the $r$ is significant. Even for df = 20, a
value of $r \geq .423$ is required for significance at the 5 per cent level. This
means that little importance can be attached to correlation coefficients
calculated on small samples unless these coefficients are fairly substantial
in size.


12.8 SIGNIFICANCE OF THE DIFFERENCE BETWEEN TWO 
CORRELATION COEFFICIENTS FOR INDEPENDENT SAMPLES 


Consider a situation where two correlation coefficients, $r_1$ and $r_2$, are
obtained on two independent samples. The correlation coefficients may, for
example, be correlations between intelligence-test scores and
mathematics-examination marks for two different freshman classes. We wish
to test whether $r_1$ is significantly different from $r_2$, that is, whether the
two samples can be considered random samples from a common
population. The null hypothesis is $H_0: \rho_1 = \rho_2$ or $H_0: \rho_1 - \rho_2 = 0$.

The significance of the difference between $r_1$ and $r_2$ can be readily
tested using Fisher's $z_r$ transformation. Convert $r_1$ and $r_2$ to $z_r$'s using
Table E of the Appendix. As stated previously, the sampling
distribution of $z_r$ is approximately normal with a standard error given by
$s_{z_r} = 1/\sqrt{N - 3}$. The standard error of the difference between two values of
$z_r$ is given by

[12.11]  $$s_{z_{r_1} - z_{r_2}} = \sqrt{s_{z_{r_1}}^2 + s_{z_{r_2}}^2} = \sqrt{\frac{1}{N_1 - 3} + \frac{1}{N_2 - 3}}$$

By dividing the difference between the two values of $z_r$ by the standard
error of the difference, we obtain the ratio

[12.12]  $$z = \frac{z_{r_1} - z_{r_2}}{\sqrt{1/(N_1 - 3) + 1/(N_2 - 3)}}$$

This is a unit-normal-curve deviate and may be so interpreted. Values of
1.96 and 2.58 are required for significance at the 5 per cent and 1 per cent
levels.
To illustrate, let the correlations between intelligence scores and
mathematics-examination marks for two freshman classes be .320 and .720. Let
the number of students in the first class be 53 and in the second 23. Are
the two coefficients significantly different? The corresponding $z_r$ values
obtained from Table E of the Appendix are .332 and .908. The required
normal deviate is

$$z = \frac{.908 - .332}{\sqrt{1/(53 - 3) + 1/(23 - 3)}} = 2.18$$

The difference between the two correlations is significant at the 5 per cent
level.

The application of a test of significance in a situation of this kind is
simple. The interpretation of what the difference in correlation means
may be difficult.

12.9 SIGNIFICANCE OF THE DIFFERENCE BETWEEN TWO
CORRELATION COEFFICIENTS FOR CORRELATED SAMPLES


Consider a situation where three measurements have been made on the
same sample of individuals. Three correlation coefficients result, $r_{12}$, $r_{13}$,
and $r_{23}$. If we wish to compare $r_{12}$ and $r_{13}$, or $r_{12}$ and $r_{23}$, the
method described in the preceding section does not apply. Here the
coefficients under comparison are not based on independent samples but
are based on the same sample and are correlated.

To test the difference between $r_{12}$ and $r_{13}$ under these conditions, we
may calculate a value t by the formula

[12.13]  $$t = \frac{(r_{12} - r_{13})\sqrt{(N - 3)(1 + r_{23})}}{\sqrt{2(1 - r_{12}^2 - r_{13}^2 - r_{23}^2 + 2 r_{12} r_{13} r_{23})}}$$

This expression follows the distribution of t with $N - 3$ degrees of
freedom. Note that to apply this test the correlation $r_{23}$ is required.

Let $X_2$ and $X_3$ be two psychological tests used to predict a criterion
measure of scholastic success $X_1$. The three correlation coefficients based
on a sample of 100 cases are $r_{12} = .60$, $r_{13} = .50$, and $r_{23} = .50$. Are $X_2$ and
$X_3$ significantly different as predictors of scholastic success? Is there a
reasonable probability that the difference between the two correlations
$r_{12}$ and $r_{13}$ can be explained in terms of sampling error? The value of t is

$$t = \frac{(.60 - .50)\sqrt{(100 - 3)(1 + .50)}}{\sqrt{2(1 - .60^2 - .50^2 - .50^2 + 2 \times .60 \times .50 \times .50)}} = 1.29$$

For df = 97, a t of about 1.99 is required for significance at the 5 per cent
level. In consequence, the difference between the two correlation
coefficients cannot be said to be significant.

The above test has certain restrictive assumptions underlying its
development and because of these is perhaps not entirely satisfactory.
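
Formula [12.13] involves enough terms that a computational check is useful. The sketch below uses illustrative names and the figures of the example.

```python
import math

def correlated_correlations_t(r12, r13, r23, n):
    """t ratio for the difference between r12 and r13 when both are
    based on the same sample of n individuals. Returns (t, df)."""
    numerator = (r12 - r13) * math.sqrt((n - 3) * (1.0 + r23))
    denominator = math.sqrt(
        2.0 * (1.0 - r12 ** 2 - r13 ** 2 - r23 ** 2 + 2.0 * r12 * r13 * r23)
    )
    return numerator / denominator, n - 3

t, df = correlated_correlations_t(r12=0.60, r13=0.50, r23=0.50, n=100)
print(f"t = {t:.2f} with {df} degrees of freedom")
```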


EXERCISES

1  Given two random samples of size 100 with sample values $p_1 = .80$
   and $p_2 = .60$, test the significance of the difference between $p_1$ and $p_2$.

2  Consider two test items A and B. In a sample of 100 people, 30 pass
   item A and fail item B, whereas 20 fail item A and pass item B. Are
   the proportions passing the two items significantly different from each
   other?

3  In a market survey 24 out of 96 males and 63 out of 180 females
   indicate a preference for a particular brand of cigarettes. Do the data
   warrant the conclusion that a sex difference exists in brand
   preference?

4  On an attitude scale, 63 and 39 individuals from a sample of 140
   indicate agreement to items A and B, respectively, and 29 individuals
   indicate agreement to both items. Is there a significant difference in
   the response elicited by the two items?

5  Given two independent samples of size 20 with $s_1^2 = 400$ and $s_2^2 = 625$,
   test the hypothesis that the variances are significantly different from
   each other.

6  Given two correlated samples of size 20 with $s_1^2 = 400$, $s_2^2 = 625$,
   and $r_{12} = .7071$, test the hypothesis that the variances are
   significantly different from each other.

7  Calculate values of $z_r$ for $r = .70$, $r = .05$, $r = -.60$, and $r = -.99$.

8  Calculate, using formula [12.10], values of t for the following values
   of r and N:

       r    .20    .30    .40    .50

9  The correlation between psychological-test scores and academic
   achievement for a sample of 147 freshmen is .40. The corresponding
   correlation for a sample of 125 sophomores is .59. Do these
   correlations differ significantly?

10 Three psychological tests are administered to a sample of 50 students.
   The correlations are $r_{12} = .70$, $r_{13} = .40$, and $r_{23} = .60$. Is $r_{12}$
   significantly different from $r_{13}$?


THE ANALYSIS OF FREQUENCIES
USING CHI SQUARE


13.1 INTRODUCTION
We have previously discussed the application of the binomial, normal, t,
and F distributions. Another distribution of considerable theoretical and
practical importance is the distribution of chi square, or $\chi^2$. In many
experimental situations we wish to compare observed with theoretical
frequencies. The observed frequencies are those obtained empirically by
direct observation or experiment. The theoretical frequencies are
generated on the basis of some hypothesis, or line of theoretical speculation,
which is independent of the data at hand. The question arises as to whether
the differences between the observed and theoretical frequencies are
significant. If they are, this constitutes evidence for the rejection of the
hypothesis or theory that gave rise to the theoretical frequencies.

Consider, for example, a die. We may formulate the hypothesis that the
die is unbiased, in which case the probability of throwing any of the six
possible values in a single toss is 1/6. The frequencies expected on the basis of
this hypothesis are the theoretical frequencies. In a series of 300 throws
the expected or theoretical frequencies of 1, 2, 3, 4, 5, and 6 are 50, 50, 50,
50, 50, and 50. Let us now experiment by throwing the die 300 times. The
observed frequencies of the values from 1 to 6 are 43, 55, 39, 56, 63, and 44.
May the differences between the observed and theoretical frequencies be
considered to result from sampling error? Are the differences highly
improbable on the basis of the null hypothesis, thereby providing evidence
for the rejection of the hypothesis that the die is unbiased?

As a further illustration, let us formulate the hypothesis that in litters of
rabbits the probability of any birth being either male or female is 1/2. Using
the binomial distribution we ascertain that the expected or theoretical
frequencies of 0, 1, 2, 3, 4, 5, and 6 males in 64 litters of 6 rabbits are 1, 6,
15, 20, 15, 6, and 1. By counting the number of males in 64 litters of six
rabbits, the corresponding observed frequencies are 0, 3, 14, 19, 20, 6, and
2. Do the observed and theoretical frequencies differ?

Consider another example. In a market research project two varieties of 
soap, A and B, are distributed to a random sample of 200 housewives. After 
a period of use the housewives are asked which they prefer. The results 
indicate that 115 prefer A and 85 prefer B. The hypothesis may be
formulated that no difference exists in consumer preference for the two 
varieties of soap; that a 50:50 split exists. Do the observed frequencies 
constitute evidence for the rejection of this hypothesis? 

The statistic $\chi^2$ is used in situations of the type described above where a
comparison of observed and theoretical frequencies is required. It has
extensive application in statistical work. For our purposes here $\chi^2$ is
defined by

[13.1]  $$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where O = an observed frequency
      E = an expected or theoretical frequency

Thus to calculate a value of $\chi^2$ we find the differences between the
observed and expected values, square these, divide each squared difference
by the appropriate expected value, and sum over all frequencies.

Table 13.1 illustrates the calculation of $\chi^2$ in comparing the observed and
expected frequencies for 300 throws of a die. Note that the sum of both the
observed and expected frequencies is equal to N; that is, $\sum O = \sum E = N$.
The value of $\chi^2$ obtained in Table 13.1 is 8.72. This is a measure of the
discrepancy between the observed and theoretical frequencies. If the
discrepancy is large, $\chi^2$ is large. If the discrepancy is small, $\chi^2$ is small. Does a
value of $\chi^2 = 8.72$ constitute evidence at an accepted level of significance
for rejecting the null hypothesis? The answer to this question demands a
consideration of the sampling distribution of $\chi^2$.

Table 13.1  Calculation of $\chi^2$ in comparing observed and expected
frequencies for 300 throws of a die

    Value of    Observed     Expected
    die         frequency    frequency
    X           O            E            O - E    (O - E)^2    (O - E)^2/E

    1            43           50           -7         49           .98
    2            55           50            5         25           .50
    3            39           50          -11        121          2.42
    4            56           50            6         36           .72
    5            63           50           13        169          3.38
    6            44           50           -6         36           .72

    Total       300          300            0                 chi^2 = 8.72
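
The calculation laid out in Table 13.1 can be expressed as a short sketch. The function name `chi_square` is an illustrative assumption; the frequencies are those of the die example.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# 300 throws of a die; an unbiased die gives an expected frequency of 50 per face
observed = [43, 55, 39, 56, 63, 44]
expected = [sum(observed) / 6] * 6

chi2 = chi_square(observed, expected)
print(f"chi square = {chi2:.2f}")
```

The same function applies unchanged to the litter and soap-preference illustrations once their observed and expected frequencies are supplied.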


13.2 THE SAMPLING DISTRIBUTION OF CHI SQUARE


The sampling distribution of $\chi^2$ may be illustrated with reference to the
tossing of coins. Let us assume that in tossing 100 unbiased coins 46 heads
and 54 tails result. The expected frequencies are 50 heads and 50 tails. A
value of $\chi^2$ may be calculated as follows:

          O      E     O - E    (O - E)^2    (O - E)^2/E

    H    46     50      -4         16           .32
    T    54     50      +4         16           .32

                                          chi^2 = .64

In the tossing of 100 coins two frequencies are obtained, one for heads
and one for tails. These frequencies are not independent. If the frequency
of heads is 46, the frequency of tails is $100 - 46 = 54$. If the frequency of
heads is 62, the frequency of tails is $100 - 62 = 38$. Quite clearly, given
either frequency, the other is determined. One frequency only is free to
vary. In this situation 1 degree of freedom is associated with the value of $\chi^2$.

Let us toss the 100 coins a second time, a third time, and so on, to obtain
different values of χ². A large number of trials may be made, and a large
number of values of χ² obtained. The frequency distribution of these values
is an experimental sampling distribution of χ² for 1 degree of freedom. It
describes the variation in χ² with repeated sampling. By inspecting the
experimental sampling distribution estimates may be made of the proportion
of times, or the probability, that values of χ² equal to or greater than
any given value will occur due to sampling fluctuation for 1 degree of
freedom. In the present illustration this assumes, of course, that the coins
are unbiased.
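
The repeated tossing described above is easy to imitate by machine. The sketch below is illustrative only: the number of trials (10,000), the random seed, and the function name `chi_square_coins` are our own choices, and 3.84 is the tabled 5 per cent value for 1 degree of freedom.

```python
import random


def chi_square_coins(n_coins, rng):
    """Toss n_coins unbiased coins once and return chi square (1 df)."""
    heads = sum(rng.random() < 0.5 for _ in range(n_coins))
    tails = n_coins - heads
    expected = n_coins / 2
    return (heads - expected) ** 2 / expected + (tails - expected) ** 2 / expected


rng = random.Random(0)  # fixed seed so that the run can be repeated
values = [chi_square_coins(100, rng) for _ in range(10_000)]

# Proportion of trials giving a chi square at or beyond the tabled 5 per cent value.
proportion = sum(v >= 3.84 for v in values) / len(values)
print(f"proportion of values >= 3.84: {proportion:.3f}")
```

The proportion obtained should lie in the neighborhood of .05. It need not equal .05 exactly, partly because of sampling fluctuation and partly because the number of heads can take whole-number values only.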

Instead of tossing 100 coins, let us throw 100 unbiased dice, obtain
observed and expected frequencies, and calculate a value of χ². In this
situation if any five frequencies are known, the sixth is determined. Five
degrees of freedom are associated with the value of χ² obtained. The 100
dice may be tossed a great many times, a value of χ² calculated for each
trial, and a frequency distribution made. This frequency distribution is an
experimental sampling distribution of χ² for 5 degrees of freedom.

The theoretical sampling distribution of χ² is known, and probabilities
may be estimated from it without using the elaborate experimental
approach described for illustrative purposes above. The equation for χ² is
complex and is not given here. It contains the number of degrees of
freedom as a variable. This means that a different sampling distribution of
χ² exists for each value of df. Figure 13.1 shows different chi-square
distributions for different values of df. χ² is never negative, a circumstance
which results from squaring the difference between the observed and
expected values. Values of χ² range from zero to infinity. The right-hand
tail of the curve is asymptotic to the abscissa. For 1 degree of freedom the
curve is asymptotic to the ordinate as well as to the abscissa.

Fig. 13.1  Chi-square distribution and 5 per cent critical regions for various
degrees of freedom. Abscissa: χ²; ordinate: relative frequency. (From
Francis G. Cornell, The essentials of educational statistics, John Wiley &
Sons, Inc., New York, 1956.)

The χ² distribution is used in tests of significance in much the same way
that the normal, t, or the F distributions are used. The null hypothesis is
assumed. This hypothesis states that no actual differences exist between
the observed and expected frequencies. A value of χ² is calculated. If this
value is equal to or greater than the critical value required for significance
… corresponding 5 and 1 per cent critical values are 11.07 and 15.09.

Table C of the Appendix provides the 5 and 1 per cent critical values for
df = 1 to df = 30. This covers the great majority of situations ordinarily
encountered in practice. Situations where a χ² is calculated based on a df
greater than 30 are infrequent. Where df is greater than 30 the expression
√(2χ²) − √(2df − 1) has a sampling distribution which is approximately
normal. Values of this expression required for significance at the 5 and 1
per cent levels …

The above discussion of χ² has undoubtedly given the reader the impression
that χ² is a statistic which is exclusively concerned with the comparison
of observed and expected frequencies. The distribution of χ² is, in fact,
a general statistical distribution and its use in the study of frequencies is
one approximate, particular application. The more general approach
defines χ² as the sum of squared normal deviates. Consider a population
with mean μ, variance σ², and a normal distribution of scores Y. A squared
standard score drawn from this population is z² = (Y − μ)²/σ². If members
are drawn from this population one at a time, the frequency distribution of
z² will be a χ² distribution with 1 degree of freedom. If members are drawn
two at a time, the distribution of z₁² + z₂² will be a χ² distribution with 2
degrees of freedom. In general for samples of size N the quantity Σz² has a
χ² distribution with N degrees of freedom.


13.3 GOODNESS OF FIT
Numerous examples may be found to illustrate the goodness of fit of a
theoretical to an observed frequency distribution. In one experiment Abbé
Mendel observed the shape and color of peas in a sample of plants. The
distribution he obtained is shown in Table 13.2. According to his genetic
theory the expected frequencies should follow the ratio 9:3:3:1. The
correspondence between observed and expected frequencies is close. The value
of χ² is .470, and no grounds exist for rejecting the null hypothesis. The
data lend confirmation to the theory. The value of χ² is smaller than we
should ordinarily expect, the probability associated with it being between
.90 and .95. Assuming the null hypothesis, a fit as good as or better than the
one observed may be expected to occur in between 5 and 10 per cent of
samples of the same size.
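
As a check on the arithmetic, the expected frequencies implied by the 9:3:3:1 ratio, and the resulting χ², may be computed as in the following sketch. The helper name `expected_from_ratio` is ours; the observed counts are those reported for Mendel's experiment in Table 13.2.

```python
def expected_from_ratio(ratio, n):
    """Split a total of n cases in proportion to the terms of a ratio."""
    total = sum(ratio)
    return [n * r / total for r in ratio]


# Round yellow, round green, angular yellow, angular green (Table 13.2).
observed = [315, 108, 101, 32]
expected = expected_from_ratio([9, 3, 3, 1], sum(observed))

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(expected)              # [312.75, 104.25, 104.25, 34.75]
print(f"chi square = {chi2:.3f}")  # chi square = 0.470
```

With four classes and only N fixed, 3 degrees of freedom are associated with this value.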

Table 13.2  Comparison of observed and expected frequencies in shape and color of
peas in experiment by Mendel

                    O        E        O − E    (O − E)²    (O − E)²/E

Round yellow       315     312.75      2.25      5.06         .016
Round green        108     104.25      3.75     14.06         .135
Angular yellow     101     104.25     −3.25     10.56         .101
Angular green       32      34.75     −2.75      7.56         .218
Total              556     556.00      0.00              χ² = .470

In testing goodness of fit the hypothesis may be entertained that the
distribution of a variable conforms to some widely known distribution such as
the binomial or normal distribution. Johnson (1949), in order to illustrate
the goodness of fit of the theoretical binomial distribution to an observed
distribution, tossed 10 coins 512 times and recorded the proportion of tails.
His data are shown in Table 13.3, together with the corresponding theoretical
binomial frequencies. The mean and standard deviation of the observed
distribution are X̄ = 0.5 and s = .162. The mean and standard deviation of
the theoretical binomial are X̄ = 0.5 and s = .156.

Note that in the calculation of χ² for these data, the small frequencies at
the tails of the distributions are combined, a procedure that is generally
advisable with data of this type. Problems in the application of chi square
resulting from the presence of small frequencies are discussed later in this
chapter. With the present data, combining small frequencies reduces the
number of frequencies from 11 to 9 and the number of degrees of freedom
from 10 to 8. The value of χ² for these data is 9.55. The value required for
significance at the 5 per cent level for 8 degrees of freedom is 15.51. The
conclusion is that the evidence is insufficient to justify rejection of the null
hypothesis. Reference to a table of χ² shows that a value of χ² equal to or
greater than the one observed might be expected to occur in about 30 per
cent of samples due to sampling fluctuation alone.
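
The theoretical binomial frequencies and the pooling of the tails can be reproduced as follows. This is a sketch under stated assumptions: the helper `pool_tails` is our own name, and the observed list is entered already pooled, the end values 7 and 10 being the combined end classes of Table 13.3.

```python
from math import comb

tosses, coins = 512, 10

# Theoretical frequencies for 10, 9, ..., 0 tails among 10 unbiased coins.
expected = [tosses * comb(coins, k) / 2 ** coins for k in range(coins, -1, -1)]


def pool_tails(freqs):
    """Combine the two extreme classes at each end of the distribution."""
    return [freqs[0] + freqs[1], *freqs[2:-2], freqs[-2] + freqs[-1]]


expected_pooled = pool_tails(expected)
# Observed frequencies after the same pooling (proportions 1.0 and 0.9
# combined at the top, 0.1 and 0.0 combined at the bottom).
observed_pooled = [7, 15, 68, 105, 134, 95, 55, 23, 10]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed_pooled, expected_pooled))
df = len(observed_pooled) - 1
print(expected_pooled)  # [5.5, 22.5, 60.0, 105.0, 126.0, 105.0, 60.0, 22.5, 5.5]
print(f"chi square = {chi2:.3f}, df = {df}")
```

Pooling is applied to the observed and expected frequencies alike, so that both still sum to 512.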

Table 13.3  Goodness of fit of binomial distribution to observed distribution
of proportion of tails from 512 tosses of 10 coins*

Proportion
of tails         O         E        O − E    (O − E)²/E

1.0              2        0.5  }
0.9              5        5.0  }     1.5        0.409
0.8             15       22.5       −7.5        2.500
0.7             68       60.0        8.0        1.067
0.6            105      105.0        0.0        0.000
0.5            134      126.0        8.0        0.508
0.4             95      105.0      −10.0        0.952
0.3             55       60.0       −5.0        0.417
0.2             23       22.5        0.5        0.011
0.1              8        5.0  }
0.0              2        0.5  }     4.5        3.682
Total          512      512.0              χ² = 9.546

* Adapted from Palmer Johnson, Statistical methods in research, Prentice-Hall,
Inc., Englewood Cliffs, N.J., 1949.

Where the theoretical frequency distribution is continuous, we require a
method for the estimation of the theoretical frequencies. In fitting a
continuous curve we calculate the proportion of the area under the theoretical
curve corresponding to each class interval. This proportion multiplied by N
is taken as the theoretical frequency within the class interval. This
procedure is illustrated in Table 13.4. The data are adapted from McNemar
(1969) and are Stanford-Binet IQ's, Form M, for a sample of 2,970
individuals.

Table 13.4  Calculation of normal distribution frequencies for Stanford-Binet
IQ's, Form M

   1            2           3          4          5          6            7            8
                          Upper     Deviation
Class                     exact     from mean              Proportion   Proportion   Expected
interval    Frequency     limit         x        x/s       below        within       frequency

160-            3  }
150-159        13  } 16                                                   .0041          12
140-149        55         149.5       44.94      2.645      .9959         .0158          47
130-139       120         139.5       34.94      2.057      .9801         .0512         152
120-129       330         129.5       24.94      1.468      .9289         .1186         352
110-119       610         119.5       14.94       .879      .8103         .1958         582
100-109       719         109.5        4.94       .291      .6145         .2316         688
90-99         592          99.5       −5.06      −.298      .3829         .1950         579
80-89         338          89.5      −15.06      −.886      .1879         .1178         350
70-79         130          79.5      −25.06     −1.475      .0701         .0505         150
60-69          48          69.5      −35.06     −2.064      .0196         .0155          46
50-59           7  }       59.5      −45.06     −2.652      .0041         .0041          12
40-49           4  } 12
30-39           1  }
We are required to calculate the theoretical normal frequencies for the
class intervals and test the goodness of fit between the theoretical and the
observed. The mean and standard deviation are X̄ = 104.56 and s = 16.99.
A normal frequency distribution is required with the same N, X̄, and s as
the observed distribution. We proceed by combining the small frequencies
at the tails of the distribution, as shown in col. 2. This reduces the number
of frequencies from 14 to 11. The frequency of 16 at the top of the
distribution contains all cases above the exact limit 149.5. The frequency of 12 at
the bottom of the distribution contains all cases below the exact limit 59.5.
We next record the exact upper limits of the class intervals (col. 3), convert
these to deviations from the mean of 104.56 (col. 4), and divide by the
standard deviation 16.99 to obtain the standard score x/s (col. 5). Thus the exact
upper limits of the class intervals are expressed in standard measure. For
example, the exact upper limit of the interval 140 to 149 is 149.5. This as a
deviation from the mean is 149.5 − 104.56 = 44.94. Dividing this by the
standard deviation 16.99, we obtain 44.94/16.99 = 2.645. We then consult a
table of areas under the normal curve and ascertain the proportions of the
area under the normal curve falling below the standard-score values x/s of
col. 5. These proportions are shown in col. 6. We observe that a proportion
.9959 of the area of the normal curve falls below 2.645 standard deviation
units above the mean, a proportion .9801 falls below 2.057 units above the
mean, and so on. By subtraction we obtain the proportions of the area of
the normal curve falling within the class intervals (col. 7). The proportion
above the exact limit 149.5 is 1.0000 − .9959 = .0041. The proportion
between 139.5 and 149.5 is .9959 − .9801 = .0158. The proportion between
129.5 and 139.5 is .9801 − .9289 = .0512, and so on. By multiplying these
proportions by N we obtain the expected frequencies (col. 8).

The above method simply involves converting the exact limits of the 
class intervals to standard deviation units, using a table of areas under the 
normal curve to find the proportion of the area within these limits and mul- 
tiplying these proportions by N to obtain the expected frequencies. 
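
The method may be sketched in Python using the standard library's `statistics.NormalDist` in place of a printed table of areas. The variable names are ours, and because the computed areas are carried to more decimal places than a four-place table, an expected frequency may differ from the tabled one by a case or so.

```python
from statistics import NormalDist

n, mean, sd = 2970, 104.56, 16.99

# Exact upper limits of the pooled classes, lowest first: 59.5, 69.5, ..., 149.5.
upper_limits = [59.5 + 10 * i for i in range(10)]

dist = NormalDist(mean, sd)
below = [dist.cdf(u) for u in upper_limits]  # proportion below each limit

# Proportion within each class: the lowest class takes everything below 59.5,
# the highest class everything above 149.5.
within = [below[0]] + [hi - lo for lo, hi in zip(below, below[1:])] + [1 - below[-1]]

expected = [n * p for p in within]
print([round(e) for e in expected])
```

The list runs from the lowest class (below 59.5) to the highest (above 149.5), that is, in the reverse of the order in which the class intervals are tabled.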

Table 13.5 shows the calculation of χ² in comparing the observed with
expected frequencies. A value of χ² = 17.02 is obtained. In this case the
number of df is 11 − 3 = 8. Although there are 11 frequencies, 8 only are
free to vary. The loss of 3 degrees of freedom results because the observed
and expected distributions are made to agree on N, X̄, and s. For df = 8, the
value of χ² required for significance at the 5 per cent level is 15.51 and at
the 1 per cent level 20.09. The obtained χ² falls between these two values at
about the 3 per cent level. Thus the chances are 3 in 100 that a fit as poor as
or poorer than the one observed would result in random sampling from a
normal population. This establishes grounds for rejecting the hypothesis
that the distribution of Stanford-Binet IQ's, Form M, is normal. The
departures from normality are, however, not gross.
Table 13.5  Goodness of fit of normal frequencies to frequency distribution
of Stanford-Binet IQ's, Form M*

Class
interval          O          E        O − E    (O − E)²/E

160-        }
150-159     }     16         12          4        1.33
140-149           55         47          8        1.36
130-139          120        152        −32        6.74
120-129          330        352        −22        1.38
110-119          610        582         28        1.35
100-109          719        688         31        1.40
90-99            592        579         13         .29
80-89            338        350        −12         .41
70-79            130        150        −20        2.67
60-69             48         46          2         .09
50-59       }
40-49       }     12         12          0         .00
30-39       }
Total          2,970      2,970          0   χ² = 17.02

* Adapted from Quinn McNemar, Psychological statistics, 4th ed., John
Wiley & Sons, Inc., New York, 1969.

Chi square may be used to test the representativeness of a sample where
certain population values are known. This in effect is a test of goodness of
fit. To illustrate, in a study of attitudes toward immigrants, a sample of 200
cases is drawn from the city of Montreal. The observed frequencies and
percentages by racial origin are shown in Table 13.6. The population
percentages obtained from census returns are also shown. These population
percentages are used to obtain the expected, or theoretical, frequencies.
The value of χ² is 27.41. For df = 2 this is highly
significant, the value required for significance at the 1 per cent level being
9.21. We may conclude that the sample is biased and cannot be considered
a random sample with respect to racial origin. Since attitudes toward
immigrants may be linked to racial origin, results obtained on this sample
may be highly questionable unless a correction is applied to adjust for the
sample bias.

Table 13.6  Application of χ² in comparing sample frequencies of racial origin
with population frequencies

Racial                Sample,     Population,
origin        O       per cent    per cent        E      O − E    (O − E)²/E

French        95        47.5        62.5         125      −30        7.20
English       67        33.5        19.4          39       28       20.10
Other         38        19.0        18.1          36        2         .11
Total        200       100.0       100.0         200        0   χ² = 27.41


13.4 TESTS OF INDEPENDENCE
A frequent application of chi square occurs where the data are comprised
of paired observations on two nominal variables. We wish to know whether
the variables are independent of each other or associated. To illustrate,
Table 13.7 presents data collected by Woo (1928) on the relationship
between eyedness and handedness in a sample of 413 subjects. Subjects
were tested for eyedness and handedness and grouped in one of three
categories on both variables. Paired observations were available for each
subject. One subject was left-handed and ambiocular, another
right-handed and right-eyed, and so on. The paired observations were entered
in a bivariate frequency table as shown in Table 13.7. Such tables
are analogous to correlation tables. They are used to study the
independence or association of the two variables. Tables of this kind are spoken of
as contingency tables. With such tables chi square provides an appropriate
test of independence.

Table 13.7  Contingency table showing relationship between eye and hand
laterality for 413 subjects and calculation of expected values

                  Left-eyed    Ambiocular    Right-eyed    Total

Left-handed          34            62            28         124
                   (35.4)        (58.5)        (30.0)
Ambidextrous         27            28            20          75
                   (21.4)        (35.4)        (18.2)
Right-handed         57           105            52         214
                   (61.1)       (101.0)        (51.8)
Total               118           195           100         413

Calculation of expected values:

(124 × 118)/413 = 35.4     (124 × 195)/413 = 58.5      (124 × 100)/413 = 30.0
(75 × 118)/413 = 21.4      (75 × 195)/413 = 35.4       (75 × 100)/413 = 18.2
(214 × 118)/413 = 61.1     (214 × 195)/413 = 101.0     (214 × 100)/413 = 51.8

In applying chi square to a contingency table to test independence, the
expected cell frequencies are derived from the data. The expected cell
frequencies are those we should expect to obtain if the two variables were
independent of each other, given the marginal totals of the rows and
columns. Chi square provides a measure of the discrepancy between the
observed cell frequencies and those expected on the basis of independence.
If the value of chi square is considered significant at some accepted
level, usually either the 5 or the 1 per cent level, we reject the null
hypothesis that no difference exists between the observed and expected
values. We then accept the alternative hypothesis that the two variables
are associated.
How are the expected cell frequencies calculated? The marginal totals
to the right of Table 13.7 show that 124 subjects were left-handed, 75
ambidextrous, and 214 right-handed. The proportions in these three
categories are 124/413, 75/413, and 214/413. These proportions are the
probabilities that an individual, selected at random from the sample of 413
individuals, is left-handed, ambidextrous, or right-handed. The marginal
totals at the bottom of Table 13.7 show that 118 subjects are left-eyed, 195
ambiocular, and 100 right-eyed. The proportions in these three categories
are 118/413, 195/413, and 100/413. These are the probabilities that an
individual is left-eyed, ambiocular, or right-eyed. Assuming the independence
of the two variables, what are the expected probabilities associated with the
joint events, or what is the expected proportion of people who are both
left-eyed and left-handed, of people who are both left-eyed and ambidextrous,
and so on? The multiplication theorem of probability states that the
probability of the joint occurrence of two or more mutually independent
events is the product of their separate probabilities. The joint probabilities
are obtained, therefore, by multiplying the probabilities obtained from the
marginal totals. The probability that any individual, selected at random
from the 413 individuals, is left-handed is 124/413. The probability that any
individual is left-eyed is 118/413. If handedness and eyedness are
independent, the probability that any individual is both left-eyed and
left-handed is the product of the separate probabilities, or 124/413 × 118/413.
This is the expected proportion in the top left-hand cell in Table
13.7. We require, however, not the expected proportion, but the expected
frequency. This is obtained by multiplying the expected proportion by N, in
this case 413. Thus the expected frequency is (124/413 × 118/413) 413 =
(124 × 118)/413 = 35.4. We observe that for computational purposes the
expected cell frequency is obtained by multiplying together the first row and
column totals and dividing by N. Similarly, the other expected cell
frequencies may be obtained. The expected frequency of left-handed ambiocular
individuals is (124 × 195)/413 = 58.5, that of left-handed right-eyed
individuals is (124 × 100)/413 = 30.0, and so on. The expected cell frequencies
are shown in parentheses in Table 13.7.

If eye and hand laterality are independent of each other, the 124
observations in the first row of Table 13.7 will be distributed in the three cells in
that row in a manner proportional to the column sums. The expected
values 35.4, 58.5, and 30.0 are proportional to the column sums 118, 195,
and 100. Likewise, the 118 cases in the first column will be distributed in
the three cells in that column in a manner proportional to the row sums.
The expected values 35.4, 21.4, and 61.1 are proportional to the row sums
124, 75, and 214. A similar proportionality exists throughout the table. The
expected cell frequencies in the rows and columns of any contingency
table are proportional to the marginal totals.

The calculation of χ² for a contingency table is similar to that for tests of
goodness of fit. The difference between each observed and expected value
is squared and divided by the expected value, obtaining (O − E)²/E. These
values are summed over all cells to obtain χ². The calculation is perhaps
most readily accomplished by writing the data in columnar fashion as
shown in Table 13.8. The value of χ² obtained is 4.021. The number of
degrees of freedom associated with this value of χ² is 4. The value of χ²
required for significance at the 5 per cent level is 9.488. We have therefore
no grounds for rejecting the hypothesis of independence between eye and
hand laterality. The data provide no evidence of a relationship between
these two variables.
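
The whole calculation for a contingency table, from marginal totals to χ², may be sketched as follows. The function names are ours and purely illustrative; the degrees of freedom are taken as (rows − 1)(columns − 1), the rule used in this chapter. Because the expected values are here carried to full precision rather than rounded to one decimal place, the result agrees with the tabled 4.021 to about the second decimal place only.

```python
def expected_frequencies(table):
    """Expected cell frequencies: row total x column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return (chi square, df) for a test of independence."""
    expected = expected_frequencies(table)
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Rows: left-handed, ambidextrous, right-handed.
# Columns: left-eyed, ambiocular, right-eyed (Table 13.7).
table = [[34, 62, 28], [27, 28, 20], [57, 105, 52]]

expected = expected_frequencies(table)
chi2, df = chi_square_independence(table)
print([[round(e, 1) for e in row] for row in expected])
print(f"chi square = {chi2:.2f}, df = {df}")
```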


Table 13.8  Calculation of χ² for data of Table 13.7

   O        E       O − E    (O − E)²    (O − E)²/E

  34      35.4      −1.4       1.96         .055
  62      58.5       3.5      12.25         .209
  28      30.0      −2.0       4.00         .133
  27      21.4       5.6      31.36        1.465
  28      35.4      −7.4      54.76        1.547
  20      18.2       1.8       3.24         .178
  57      61.1      −4.1      16.81         .275
 105     101.0       4.0      16.00         .158
  52      51.8        .2        .04         .001
                                      χ² = 4.021

How is the number of degrees of freedom calculated? In testing
independence in a contingency table of R rows and C columns the number of
degrees of freedom is

[13.2]
\[ df = (R - 1)(C - 1) \]

For a 2 × 2 table the number of degrees of freedom
is (2 − 1)(2 − 1) = 1. Consider for explanatory purposes the 2 × 2 table:

          A₁      A₂
B₁        25      35   |   60
B₂         5      35   |   40
          30      70      100

Given the restrictions of the marginal totals, if one cell value is known, the
remaining three values are determined. Thus if we know that the value in
the top left cell is 25, the top right cell must be 60 − 25 = 35, the bottom left
30 − 25 = 5, and the bottom right 40 − 5 = 35. If one cell value is known,
no freedom of variation remains. One degree of freedom only is associated
with the variation in the data. Similarly, in Table 13.7 only four cell values
are free to vary. Given the marginal totals and four cell values, the
remaining cell values are determined.
A frequently occurring type of contingency table is the 2 × 2, or fourfold,
contingency table. A χ² test for independence can be readily obtained for
such a table without calculating the expected values. Let us represent the
cell and marginal frequencies by the following notation:

     A        B     |  A + B
     C        D     |  C + D
   A + C    B + D   |    N

Chi square may then be calculated by the formula

\[ \chi^2 = \frac{N(AD - BC)^2}{(A + B)(C + D)(A + C)(B + D)} \]

Note that the term in the numerator, AD − BC, is simply the difference
between the two cross products and the term in the denominator is the
product of the four marginal totals.

Consider the following 2 × 2 table showing the relationship between
ratings of successful or unsuccessful on a job and pass or fail on an
ability-test item.

                            Test item
                          Fail     Pass
Job       Successful       20       40    |   60
rating    Unsuccessful     25       15    |   40
                           45       55       100

Is there an association between performance on the job and performance
on the test item? Does the item differentiate significantly between the
successful and unsuccessful individuals? Chi square is as follows:

\[ \chi^2 = \frac{100(20 \times 15 - 40 \times 25)^2}{60 \times 40 \times 45 \times 55} = 8.25 \]

For df = 1, a χ² = 8.25 is significant at better than the 1 per cent level. The
data provide fairly conclusive evidence that the test item differentiates
between individuals on the basis of their job performance.
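
The fourfold formula is easily checked against the longer route through the expected frequencies. In the sketch below both function names are ours; the data are those of the job-rating example.

```python
def chi_square_2x2(a, b, c, d):
    """Chi square for a fourfold table from the cross products (1 df)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi_square_via_expected(a, b, c, d):
    """The same quantity computed from observed and expected cell frequencies."""
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    observed = ((a, b), (c, d))
    return sum(
        (observed[i][j] - rows[i] * cols[j] / n) ** 2 / (rows[i] * cols[j] / n)
        for i in range(2)
        for j in range(2)
    )


# A = 20, B = 40 (successful: fail, pass); C = 25, D = 15 (unsuccessful).
shortcut = chi_square_2x2(20, 40, 25, 15)
long_way = chi_square_via_expected(20, 40, 25, 15)
print(f"{shortcut:.2f} {long_way:.2f}")  # 8.25 8.25
```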


13.5 THE APPLICATION OF χ² IN TESTING THE SIGNIFICANCE
OF THE DIFFERENCE BETWEEN PROPORTIONS


… and females:

                  Frequency                       Proportion
             Agree    Disagree      N        Agree    Disagree

Males          70        70        140       .500       .500      1.00
Females        20        40         60       .333       .667      1.00
Total          90       110        200       .450       .550      1.00

The numbers of males and females are N₁ = 140 and N₂ = 60, respectively.
The proportions of males and females indicating agreement to the attitude
statement are p₁ = 70/140 = .500 and p₂ = 20/60 = .333. Is there a significant
difference in the attitudes of males and females? To apply the procedure
previously described we calculate a proportion p based on a combination of
data for the two samples. With the above data

\[ p = \frac{70 + 20}{140 + 60} = \frac{90}{200} = .450 \]
\[ q = 1 - p = 1 - .450 = .550 \]

The required normal deviate is then

\[ z = \frac{p_1 - p_2}{\sqrt{pq[(1/N_1) + (1/N_2)]}} = \frac{.500 - .333}{\sqrt{.450 \times .550\left(\frac{1}{140} + \frac{1}{60}\right)}} = 2.172 \]

The difference between the two proportions falls between the 5 and 1 per
cent levels. Reference to a table of areas under the normal curve shows
that the proportion of the area falling beyond plus and minus 2.172 standard
deviation units from the mean is close to .03. The difference may be taken
as significant at about the 3 per cent level. Let us now apply the formula for
calculating χ² for a 2 × 2 contingency table to the same data. We have


\[ \chi^2 = \frac{N(AD - BC)^2}{(A + B)(C + D)(A + C)(B + D)} = \frac{200(70 \times 40 - 70 \times 20)^2}{140 \times 60 \times 90 \times 110} = 4.717 \]

Consulting a table of χ² with 1 degree of freedom, we observe that the
proportion of the area in the tail of the distribution of χ² is about .03 and the
difference between proportions may be said to be significant at about the 3
per cent level. We observe also that χ² = (2.172)² = 4.717. The two
procedures for testing the significance of the difference between proportions
for independent samples lead to identical results. From a computational
viewpoint the χ² test is the more convenient. Considerations pertaining to
small frequencies apply also to the application of χ² in testing
the significance of the difference between proportions (Sec. 13.6).

Where the data are correlated and are composed of paired observations,
the normal deviate for testing the significance of the differences between
proportions is given (Sec. 12.3) by the formula

\[ z = \frac{D - A}{\sqrt{A + D}} \]

where D and A are cell frequencies in the bottom right and top left cells,
respectively, of a 2 × 2 table. Instead of calculating a critical ratio and
referring this to the normal curve, we may calculate χ² by the formula

[13.3]
\[ \chi^2 = \frac{(D - A)^2}{A + D} \]

For the data shown in Sec. 12.3, where we wish to test the significance of
the differences between proportions of agreements to an attitude question
for the same individuals tested on two occasions, we obtain a z = 3.16. The
difference is significant at better than the 1 per cent level. The value of the
probability is .0016. The value of χ² calculated on the same data is (3.16)²,
or 9.986. The probability is the same as before.
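
The identity of the two procedures for independent samples can be verified numerically. The sketch below uses the male and female frequencies of this section; the function names are ours, and `statistics.NormalDist` stands in for the table of areas under the normal curve. Carried to full precision the values come out near 2.171 and 4.714, agreeing with the 2.172 and 4.717 of the text to within the rounding of the intermediate proportions.

```python
from math import sqrt
from statistics import NormalDist


def z_two_proportions(x1, n1, x2, n2):
    """Normal deviate for the difference between two independent proportions."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)  # proportion based on the combined samples
    q = 1 - p
    return (p1 - p2) / sqrt(p * q * (1 / n1 + 1 / n2))


def chi_square_2x2(a, b, c, d):
    """Chi square for a fourfold table from the cross products (1 df)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


z = z_two_proportions(70, 140, 20, 60)  # males agree 70/140, females 20/60
chi2 = chi_square_2x2(70, 70, 20, 40)   # rows: males, females; cols: agree, disagree
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.3f}, z squared = {z * z:.3f}, chi square = {chi2:.3f}")
print(f"two-tailed probability = {p_two_tailed:.3f}")
```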


13.6 SMALL EXPECTED FREQUENCIES 


The distribution of χ² used in determining critical significance values is a
continuous theoretical frequency curve. Where the expected frequencies
are small, the actual sampling distribution of χ² may exhibit marked
discontinuity … brings the observed and expected values
closer together and decreases the value of χ². This correction should be
… is less than 5, and some
writers suggest 10. Indeed, it may be argued that the correction should
always be used. For large expected frequencies the correction will be
negligible.


The formula used in computing χ² from a 2 × 2 table can be written to
incorporate Yates's correction for continuity. This formula becomes

\[ \chi^2 = \frac{N(|AD - BC| - N/2)^2}{(A + B)(C + D)(A + C)(B + D)} \]

The term |AD − BC| is the absolute difference, that is, the difference
taken regardless of sign. The correction amounts to subtracting N/2 from
this absolute difference.
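
A short sketch may make the effect of the correction concrete. The function and its `yates` switch are our own illustration, and the cell frequencies are the sociometric counts used as the example in this section (A = 3, B = 5, C = 10, D = 2).

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Chi square for a fourfold table, optionally with Yates's correction."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff -= n / 2  # subtract N/2 from the absolute difference
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


corrected = chi_square_2x2(3, 5, 10, 2, yates=True)
uncorrected = chi_square_2x2(3, 5, 10, 2)
print(f"with correction: {corrected:.2f}")       # with correction: 2.65
print(f"without correction: {uncorrected:.2f}")  # without correction: 4.43
```

With so few cases the correction carries the result from one side of the 5 per cent value, 3.84, to the other.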
The following data show the sociometric choices for a group of 20
Protestant and Jewish children:

                               Chosen
                        Protestant    Jewish
Chooser    Jewish            3           5     |    8
           Protestant       10           2     |   12
                            13           7         20

The value of χ² using Yates's correction is

\[ \chi^2 = \frac{20(|6 - 50| - 20/2)^2}{8 \times 12 \times 13 \times 7} = 2.65 \]

This value falls at about the 10 per cent level. The evidence does not justify
the rejection of the hypothesis that sociometric choice is independent of
whether the child is Jewish or Protestant. We note that if χ² is calculated
on these data without Yates's correction, a value χ² = 4.43 is obtained.
This value falls at about the 3.5 per cent level. If Yates's correction had not
been used, the result would be considered significant at better than the 5
per cent level.

With 2 or more degrees of freedom the error introduced by small
expected frequencies is of less consequence than with 1 degree of freedom.
An expectation of not less than 2 in each cell will permit the estimation of
roughly approximate probabilities. If the frequencies are 5 or more, good
approximations to the exact probabilities are obtained. With certain types
of data it is a common practice to combine frequencies. In testing the
goodness of fit of a theoretical to an observed frequency distribution, small
frequencies at the tails may be combined. On occasion it may be possible
without serious distortion of the data to combine rows and columns of a
contingency table to increase the expected cell frequencies.

With 1 degree of freedom where the expected frequencies are small, an
exact test of significance may be applied. This involves the determination
of exact probabilities, as distinct from those estimated from the continuous
χ² curve. An exact test of significance for a 2 × 2 table is described in
Chap. 22 of this book.


13.7 MISCELLANEOUS OBSERVATIONS ON CHI SQUARE


In this section we shall consider a number of miscellaneous points about χ²
not hitherto discussed.


One-tailed and two-tailed tests  Tables of χ² used for tests of significance are
based on one tail only, the tail to the right, of the sampling distribution of
χ². Table C of the Appendix shows that for 1 degree of freedom 5 per cent
of the area of the distribution falls to the right of χ² = 3.84 and 1 per cent to
the right of χ² = 6.64. These are not critical values for directional, or
one-tailed, tests as described in Chap. 11. Although one tail only of the
sampling distribution of χ² is used, the tabled values are those required for
testing the significance of a difference regardless of direction, that is, for
two-tailed tests. The critical ratio or normal deviate required for
significance at the 5 per cent level for a two-tailed test is 1.96. If this value is
squared, we obtain 3.84, the χ² value at the 5 per cent level for 1 degree of
freedom. For 1 degree of freedom the square root of χ² is a normal deviate
and may be used with reference to the normal curve in applying two-tailed
tests. In effect, because χ² is the square of the normal deviate for 1 degree
of freedom, both tails of the normal curve are incorporated in the right tail
of the χ² curve. In many situations where χ² is applied, the idea of a
directional, or one-tailed, test has little meaning. In tests of goodness of fit and in
most tests of independence we are usually not concerned with the direction
of the difference observed. If a one-tailed test is required, the proportionate
areas in the chi-square tables should be halved. The value of χ² required
for significance at the 5 per cent level for a one-tailed test is 2.71 for df = 1.
The corresponding value at the 1 per cent level is 5.41. These are the
squares of the normal deviates 1.64 and 2.33 required for significance for a
one-tailed test at the 5 and 1 per cent levels, respectively.
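
The relation between the normal deviates and the χ² values for 1 degree of freedom can be checked directly. The sketch below uses `statistics.NormalDist.inv_cdf` from the Python standard library in place of printed tables; the function name is ours. The values obtained agree with those quoted above to within rounding in the second decimal place.

```python
from statistics import NormalDist

normal = NormalDist()


def chi_square_critical_1df(alpha, one_tailed=False):
    """Square of the normal deviate cutting off alpha in one tail or in both."""
    tail_area = alpha if one_tailed else alpha / 2
    return normal.inv_cdf(1 - tail_area) ** 2


two_tailed_5 = chi_square_critical_1df(0.05)
two_tailed_1 = chi_square_critical_1df(0.01)
one_tailed_5 = chi_square_critical_1df(0.05, one_tailed=True)
one_tailed_1 = chi_square_critical_1df(0.01, one_tailed=True)

print(f"two-tailed: {two_tailed_5:.3f} {two_tailed_1:.3f}")
print(f"one-tailed: {one_tailed_5:.3f} {one_tailed_1:.3f}")
```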


Chi square and sample size  The value of χ² is related to the size of the
sample. If an actual difference exists between observed and expected
values, this difference will tend to increase as sample size increases. χ² will
also increase, and the associated probability value will decrease. Consider
the following tables:

        1                       2                       3

   6     4  | 10          12     8  | 20          24    16  |  40
   4    11  | 15           8    22  | 30          16    44  |  60
  10    15    25          20    30    50          40    60    100

   χ² = 2.78               χ² = 5.56               χ² = 11.12

… expected values, χ² will tend to remain unchanged as sample size
increases. For a constant difference between observed and expected
values χ² will decrease as sample size increases. If we double the sample
size and hold the difference between observed and expected values
constant, the value of χ² will be reduced by one-half.

Alternative formula for chi square  We can readily demonstrate that

[13.5]
\[ \chi^2 = \sum \frac{(O - E)^2}{E} = \sum \frac{O^2}{E} - N \]

This alternative way of writing χ² is sometimes useful for computational
purposes.
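
The demonstration is brief. It depends only on the fact, noted earlier in the chapter, that the observed and the expected frequencies both sum to N. Expanding the square and dividing each term by E gives:

```latex
\begin{align*}
\chi^2 &= \sum \frac{(O - E)^2}{E}
        = \sum \frac{O^2 - 2OE + E^2}{E} \\
       &= \sum \frac{O^2}{E} - 2\sum O + \sum E \\
       &= \sum \frac{O^2}{E} - 2N + N
        = \sum \frac{O^2}{E} - N
\end{align*}
```

For the die data of Table 13.1, for example, ΣO²/E = 15,436/50 = 308.72, and 308.72 − 300 = 8.72, the value obtained before.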


2 × 2 tables with more than 1 degree of freedom   For most 2 × 2 tables the
row and column totals are considered fixed and 1 degree of freedom is
associated with the variation in the data. Situations arise where either the
row or column totals, or both, are free to vary. In a sociometric study on a
class of 8 Jewish and 12 Protestant children, each child may be asked to
choose one other child with whom [...] choose himself. If choices are
independent [...] Protestant, what are the expected frequencies of choices?
On a strictly random basis, how many Jewish choosers will make Protestant
choices and so on? Since a child cannot choose himself, a Jewish child chooses
from among seven Jewish and 12 Protestant children. The probability of a
Jewish child choosing a Jewish child in making a choice at random is 7/19
and of choosing a Protestant child 12/19. Since eight choices are made, the
expected frequency of Jewish choices is 7/19 × 8 = 2.95. The expected
frequency of Protestant choices is 12/19 × 8 = 5.05. Likewise, we find that the
expected frequency of Jewish choices by Protestant children is 8/19 × 12 =
5.05, and the expected frequency of Protestant choices is 11/19 × 12 = 6.95.
The expected frequencies are tabulated below together with the observed
frequencies.


                     Expected
                      chosen
Chooser           J        P      Total
   J            2.95     5.05        8
   P            5.05     6.95       12
   Total        8.00    12.00       20

[Observed frequencies: table illegible in this copy; surviving entries 20, 14.]


In this example the row totals are fixed. The column totals are free to
vary from expectation. In these data we observe a tendency for both Jewish
and Protestant children to choose Protestant children more frequently
than expectation. In this case a χ² based on a comparison of the expected
and observed cell frequencies has 2 degrees of freedom. χ² may of course
be applied to the observed frequencies in the usual way with 1 degree of
freedom. This is a test of association, within the restrictions of the marginal
totals, of the religious affiliation of the choosers and the chosen. It is not a
test of randomness of choice. Fourfold tables may occur where both row
and column totals are free to vary. Such tables arise where all expected
frequencies are derived in a manner entirely independent of the data.
χ² here has 3 degrees of freedom.
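The expected frequencies in the sociometric example follow from the rule that a child chooses at random among the 19 other children. The sketch below reproduces that arithmetic with exact fractions; the function name and the dictionary layout are illustrative assumptions.

```python
from fractions import Fraction


def expected_choices(n_jewish=8, n_protestant=12):
    """Expected (chooser, chosen) frequencies when no child may choose himself."""
    others = n_jewish + n_protestant - 1  # children available to each chooser
    return {
        ("J", "J"): n_jewish * Fraction(n_jewish - 1, others),
        ("J", "P"): n_jewish * Fraction(n_protestant, others),
        ("P", "J"): n_protestant * Fraction(n_jewish, others),
        ("P", "P"): n_protestant * Fraction(n_protestant - 1, others),
    }


for (chooser, chosen), value in expected_choices().items():
    print(chooser, chosen, round(float(value), 2))
```

The row totals of the expected table are fixed at 8 and 12 choices, while the column totals are free to vary, as noted in the text.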


Reduction of an R × C table to a 2 × 2 table   A table with R rows and C
columns may be reduced to a 2 × 2 table in order to facilitate a rapid test of
association with χ². This procedure is legitimate enough provided the
points of dichotomy of the two variables are made without reference to the
cell frequencies. The investigator may decide a priori to dichotomize about
the two medians, or something of the sort. Data are found where the points
of dichotomy have been located in order to maximize the association in the
data and obtain thereby a significant χ². This practice is spurious and
should be enthusiastically discouraged.




EXERCISES

1 In 180 throws of a die the observed frequencies of the values from 1 to 6
are 34, 27, 41, 25, 18, and 35. Test the hypothesis that the die is unbiased.

2 A psychological test yields a distribution of scores as shown below. The
theoretical normal frequencies are also shown:

Class        Observed       Theoretical
interval     frequencies    frequencies
90-99             1  }            8
80-89             5  }
70-79            17              17
60-69            30              31
50-59            50              39
40-49            35              34
30-39            10              20
20-29             6               8
10-19             4  }          ...
0-9             ...  }
                160             160

Test the goodness of fit between the theoretical and the observed
frequencies.

3 [...] and two columns, (b) two rows and three columns, [...]
five columns? Assume fixed marginal totals.


4 The following data relate to patients in a mental hospital:

                        Rating
              Improvement   No improvement   Total
Therapy A          16             28           44
Therapy B           9             37           46
Total              25             65           90

Test the hypothesis that method of therapy is independent of rating.


5 The following contingency table describes the relation between scores
above and below the median on an examination and ratings of job
performance for 100 employees.

                              Rating
                 Below                    Above
                 average     Average      average     Total
Above median       ...         ...          ...         71
Below median       ...         ...          ...         29
Total              ...         ...          ...        100

Test the hypothesis that job performance is independent of examination
results.

6 A sample used in a market survey contains 100 males and 100 females.
Of the males 33 and of the females 18 state a preference for brand A.
Use χ² to test the hypothesis that no sex differences exist in consumer
preference.

7 Calculate χ² for the following tables, using Yates's correction for
continuity:

(a)                            Weight
                       Increase     No increase
Gentled animals          ...            ...
Ungentled animals        ...            ...

(b)                                 Locus of lesion
                                    ...          ...
Impairment in performance           ...          ...
No impairment in performance        ...          ...


THE DESIGN OF EXPERIMENTS


THE STRUCTURE AND PLANNING
OF EXPERIMENTS


14.1 INTRODUCTION

Several subsequent chapters of this book are concerned with the analysis
of variance and covariance. These procedures are used in the analysis of
the data of experiments. A very brief, and elementary, discussion of the
structure and planning of experiments should serve as a useful preliminary
to a detailed study of these methods of analysis. The study of the structure
and planning of experiments is a field of investigation commonly called the
design of experiments. This subject has many aspects, some quite
complex. The present discussion will deal in a nonmathematical way with
only a few of the simplest aspects of experimental design.

All experiments are concerned with the relations between variables. In
the simplest type of experiment two variables only are involved, an
independent variable and a dependent variable. To illustrate, an experiment
may be initiated to compare three different methods of teaching
French, designated A, B, and C. Each method may be applied to a different
group of experimental subjects. Following a period of instruction,
performance may be measured using an achievement test. In this experiment the
different methods of teaching French constitute the levels or categories of
the independent variable. The investigator decides which methods will be
used and the size of the groups to which they will be applied; that is, he
controls the values or categories of the independent variable and the
frequency of occurrence of those values or categories. The measures of
achievement constitute the dependent variable. The experiment is concerned
with the way in which achievement in French depends on the
method of instruction used. The essence of the idea of an experiment lies


in the simple fact that the investigator selects the values or categories of 
the treatment variable and the frequency of their occurrence. This enables 
him to study an indefinitely large number of relations which are not 
amenable to study by observational or correlational methods, and may not 
in fact have any existence in nature at all prior to the conduct of the experi- 
ment. New knowledge is thereby produced. The tremendous proliferation 
of human knowledge in the past 20 years is in large measure a result of 
experimentation. It is clearly desirable, therefore, that some of the princi- 
ples underlying such experimentation should be understood. 

In developing the design for an experiment, the investigator must (1)
select the values or categories of the independent variable, or variables, to
be compared; (2) select the subjects for the experiment; (3) apply rules or
procedures whereby subjects are assigned to the particular values or 
categories of the independent variable; (4) specify the observations or 
measurements to be made on each subject. In the experiment mentioned 
above, on the comparison of different methods of teaching French, the 
investigator must select the different methods of teaching to be compared. 
He must choose the experimental subjects to which these methods are 
applied. He must allocate subjects to methods. Also decisions must be 
made regarding appropriate measures of achievement which will yield 
valid comparisons of the methods used. 


14.2 TERMINOLOGY

Comment on terminology is appropriate here. An independent variable
used in an experiment may be either a treatment variable or a classification
variable. [...] different groups of subjects. In effect, the subjects are treated
in some way by the experimenter. Experimental subjects may, however, be
classified on a [...] Examples are sex, age, disease
entity, IQ level, socioeconomic status, and so on. Although the values of a
classification variable are not, as it were, created by the experimenter, as is
the case with a treatment variable, the investigator nonetheless selects the
classification variables which are included in the experiment or are the
objects of attention.


[...] independent variable are spoken [...] three, or more levels of a [...]
In the literature on experimental design the unit to which a treatment is
applied is frequently spoken of as a plot, a term which derives from
agricultural experimentation. In experimental work in psychology and education
the plot is usually a human subject or an experimental animal. The term
"plot" will be used infrequently in this text. Measurements obtained from a
plot are sometimes spoken of as the yield, a term which stems also from
agricultural usage. In psychology and education the measurements or
observations made on a group of subjects or animals correspond to the
yield for a number of plots. A grouping of experimental units which is
homogeneous with regard to some basis of classification is spoken of as a
block. An experimenter may select 50 subjects, of which 25 are male and 25
female. The two groupings by sex are blocks.


14.3 THE CLASSIFICATION OF VARIABLES IN
RELATION TO EXPERIMENTS
As indicated above, experiments are concerned with the relations between
variables. In Chap. 1 of this book variables were classified as nominal,
ordinal, or interval-ratio types. In a simple two-variable experiment the
treatment variables may be nominal, ordinal, or interval-ratio; the
dependent variable may also be nominal, ordinal, or interval-ratio. The methods
to be applied in the analysis of the data of experiments and the type of
questions which experiments can answer are determined by the nature of
the variables. Thus, in effect, the nature of the information which an
experiment can yield, and the analytic methods which communicate that
information to our understanding, depend on the nature of the variables.

To illustrate, consider an experiment in which three types of therapy are
applied to three groups of depressed patients. After a time period,
observations are made indicating those subjects that show evidence of recovery
and those that do not. In this experiment both variables are nominal. The
investigator may calculate the proportions in the three groups that show
recovery and compare these proportions one with another. The magnitude
of the differences between proportions is a measure of the difference
between treatments. No further analysis of these data is possible.
Consider, on the other hand, an experiment in which both the treatment
variable and the dependent variable are of the interval or ratio type. The
treatment variable may consist of five equally spaced dosages of a drug, and
the dependent variable may be reaction time. Here the investigator may not
only compare the mean reaction times for each dosage with every other
dosage, but he may explore the nature of the functional relation between
the two variables. Reaction time may increase or decrease in a linear
fashion with change in the treatment variable; it may increase and then
decrease; or some other type of relation may exist. Here we have
considered two experiments. In one experiment both variables were nominal; in
the other, both were either interval or ratio. In many experiments the treat- 
ment variables may be nominal, and the dependent variable may be of the 
interval or ratio type; or the treatment variable may be ordinal, and the 
dependent variable may be nominal, and so on. The nature of the variables 
determines the method of analysis employed and the nature of the conclu- 
sions drawn. 


14.4 SINGLE-FACTOR EXPERIMENTS


Many experiments involve a single treatment, or classification, variable
with two or more levels. Let us initially consider experiments in which the
single factor is a treatment variable with k levels or categories, and not a
classification variable. Such experiments are of a variety of types, some of
which are discussed here. First, a group of experimental subjects may be
divided into k independent groups, using a random method. A different
treatment may then be applied to each group. One group may be a control
group, that is, a group to which no treatment is applied. A meaningful
interpretation of the experiment may require a comparison of results
obtained under treatment with results obtained in the absence of
treatment. Comparisons may be made between treatments and a control,
between treatments, or both. Second, some single-factor experiments
involve a single group of subjects. Each subject receives all k treatments.
Repeated observations or measurements are made under k conditions, one
of which may be a control condition, on the same subjects. In such an
experiment as this the measurements made under the k treatments will
not be independent. Positive correlation will usually exist between the
paired measurements obtained under any two treatments. These correlations
will reduce the magnitude of the error used in the comparisons of the
separate treatment means. Third, a single-factor experiment may consist of
groups that are matched on one or more variables which are known to be
correlated with the dependent variable. In an experiment on three different
methods of teaching French, IQ may be known to be correlated with
achievement in French. Three groups of subjects, paired or matched
subject by subject by IQ, may be used. The rationale here is that because
IQ is correlated with French achievement, [...]
one or more control variables. This results in the same error reduction as
the individual matching of subjects (see McNemar, 1969).
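The reduction in error produced by positive correlation between paired measurements can be illustrated with the standard error of the difference between two correlated means. The sketch below assumes the usual formula, in which the variance of the difference is the sum of the two variances less twice the covariance; the standard deviations, correlations, and sample size are hypothetical values chosen only for illustration.

```python
from math import sqrt


def se_of_mean_difference(s1, s2, r, n):
    """Standard error of the difference between two means based on the
    same n subjects, with standard deviations s1 and s2 and correlation r."""
    variance_of_difference = s1 ** 2 + s2 ** 2 - 2 * r * s1 * s2
    return sqrt(variance_of_difference / n)


# Hypothetical values: equal standard deviations of 10 and 25 subjects.
for r in (0.0, 0.3, 0.6, 0.9):
    print(r, round(se_of_mean_difference(10, 10, r, 25), 3))
```

As r increases the error term shrinks, so a given difference between treatment means is evaluated against a smaller error.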


14.5 RANDOMIZATION

In the design of experiments frequent use is made of randomization. In
general the purpose of randomization is to ensure that extraneous variables
which are concomitant with the dependent variable, and may be correlated
with it, will not introduce systematic bias in the experimental results.
Consider again the experiment in which three methods of teaching French are
compared. If subjects are assigned to the three groups using a random
method, IQ, which may be correlated with French achievement, will not
vary in any systematic way from group to group. Also, an indefinitely large
number of other variables, the specifications of many of which may not
have entered the purview of the investigator, will be rendered powerless to
introduce systematic bias. Although methods of randomization differ from
one type of experiment to another, the general purpose of randomization is
to protect the validity of the experiment by controlling the biasing
influence of extraneous variables. A quotation from Cochran and Cox (1957) is
relevant here.

Randomization is somewhat analogous to insurance, in that it is a precaution against
disturbances which may or may not occur and that may or may not be serious if they do occur. It is
generally advisable to take the trouble to randomize even when it is not expected that there will
be any serious bias from failure to randomize. The experimenter is thus protected against
unusual events that upset his expectations.

In experimental design the term "randomization" refers not to any
subjective impression of haphazard arrangement but to clearly stated
operational procedures. These procedures may involve the tossing of coins, the
drawing of numbered cards from a well-shuffled deck, or the use of tables
of random numbers. Such tables are composed of series of digits from zero
to nine. Each digit occurs with approximately equal frequency, and
adjacent digits are independent of each other. Tables of random numbers are
available in Fisher and Yates (1963), and elsewhere. To illustrate, suppose
we wish to select a sample of 12 experimental subjects from a sample of 40
subjects. Identify the subjects by the numbers 1 to 40. Enter a row,
column, or diagonal of two-digit numbers in a table of random numbers,
and select the first 12 different numbers between 1 and 40 as they occur.
This procedure will yield a random set of 12 subjects. Other procedures
may, of course, be used.
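The procedure for selecting 12 subjects from 40 can be sketched as follows. A printed table of random numbers is replaced here, purely as an assumption for illustration, by a sequence of two-digit numbers from Python's pseudorandom generator; the function name and the seed are arbitrary.

```python
import random


def select_subjects(two_digit_numbers, sample_size=12, population=40):
    """Take the first `sample_size` different numbers between 1 and
    `population` in the order in which they occur."""
    chosen = []
    for number in two_digit_numbers:
        if 1 <= number <= population and number not in chosen:
            chosen.append(number)
            if len(chosen) == sample_size:
                break
    return chosen


rng = random.Random(1963)  # fixed seed only so that the sketch is repeatable
# Stand-in for a row of two-digit entries (00 to 99) in a printed table.
table_entries = [rng.randrange(100) for _ in range(2000)]
sample = select_subjects(table_entries)
print(sorted(sample))
```

Numbers outside the range 1 to 40, and numbers already selected, are simply passed over, as in the procedure described above.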

In a single-factor experiment with k independent groups of n1, n2, n3, . . . ,
nk subjects, an appropriate procedure is to choose n1 subjects at random
for the first group, n2 at random for the second, and so on. This procedure is
continued until nk subjects remain for the kth group. In an experiment in
which k experimental treatments are administered to the same group of n
subjects and repeated measurements are made on the same subjects, an
order effect may exist; that is, the result observed may not be independent
of the order of treatment for reason of fatigue, practice, or some other
cause. For example, measures of reaction time may be made on the same
group of subjects under four different dosages of a drug. Here quite clearly 
an order effect may exist. In such an experiment as this the order of
treatments may be randomized for each subject, thereby eliminating any
systematic influence of order on the treatment means. When there are two
treatments only, A and B, a common practice is to use the order AB for half
the subjects and the order BA for the other half, assigning subjects to
orders at random. In experiments involving matched groups, the members
of each matched pair, triplet, or quadruplet may be allocated to treatments
by a random method.
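The two allocation procedures described in this section, random division into k independent groups and the AB/BA arrangement for two treatments, can be sketched as below. The function names, subject labels, and seed are illustrative assumptions; shuffling the whole list and cutting it into consecutive segments is equivalent to drawing each group at random in turn.

```python
import random


def assign_to_groups(subjects, group_sizes, rng):
    """Randomly divide subjects into independent groups of the given sizes."""
    if sum(group_sizes) != len(subjects):
        raise ValueError("group sizes must add up to the number of subjects")
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups, start = [], 0
    for size in group_sizes:
        groups.append(shuffled[start:start + size])
        start += size
    return groups


def counterbalance_ab(subjects, rng):
    """Assign order AB to a random half of the subjects and BA to the rest."""
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {s: ("AB" if i < half else "BA") for i, s in enumerate(shuffled)}


rng = random.Random(14)  # arbitrary seed, used only for repeatability
groups = assign_to_groups(range(1, 31), [10, 10, 10], rng)
orders = counterbalance_ab(range(1, 11), rng)
print(groups)
print(orders)
```

With more than two treatments, a separate random order of treatments may be drawn for each subject in the same manner.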


14.6 FACTORIAL EXPERIMENTS


The experiments discussed hitherto in this chapter have involved a single
independent variable or factor. Experiments may, however, be designed to
study simultaneously the effects of two or more independent variables. For
example, the experiment in which three methods of teaching French are
compared may be extended to include a comparison of spaced versus
massed learning. Under the spaced conditions, subjects receive short
periods of intensive instruction separated by time intervals. Under the
massed learning conditions, subjects receive intensive instruction for a
prolonged time period. The effect on achievement of the six possible
combinations of three methods of teaching and two learning conditions may be
investigated, each combination being applied to a different group of
experimental subjects. Such an experiment is called a factorial experiment.
Experiments in which the treatments are combinations of levels of two or
more factors are said to be factorial. If all possible treatment combinations
are studied, the experiment is said to be a complete factorial experiment. In
some experiments the factors have two levels only. We may speak of a
2 × 2 experiment, a 2 × 2 × 2 experiment, or a 2^n experiment, where n is
the number of factors.


test scores obtained. 


Learning                Teaching method
condition          A            B            C

Spaced          72   34      78   29      16   29
                96   55      25   20      24   41

Massed          81   85      75   45      19   19
                99   62      36   26      41   46
Three variables are involved in this experiment. Two of these, method
and learning condition, are independent variables, and one,
achievement-test score, is the dependent variable. Three variate values are
available for each subject: the method used, the learning condition, and the
achievement-test score. Method and learning condition are nominal
variables. These data could readily be written in columnar fashion as follows:


Learning    Teaching   Test        Learning    Teaching   Test
condition   method     score       condition   method     score

    S          A        72             M          A        81
    S          A        96             M          A        99
    S          A        34             M          A        85
    S          A        55             M          A        62
    S          B        78             M          B        75
    S          B        25             M          B        36
    S          B        29             M          B        45
    S          B        20             M          B        26
    S          C        16             M          C        19
    S          C        24             M          C        41
    S          C        29             M          C        19
    S          C        41             M          C        46


The purpose of this experiment is to explore the relation between the
learning condition and test scores and the teaching method and test scores.
The relations between the six combinations of learning condition and
teaching method may also be studied. An important feature of the
structure of this experiment is that the two nominal variables, learning condition
and teaching method, are independent of each other. By choosing six
groups of equal size, the independence of these two variables is assured. If
the numbers of cases in the six groups, when written in the form of a 2 × 3
contingency table, are proportional to the marginal totals, and χ² for the
table is zero, the independence of the two treatment variables will also be
assured. In the design of factorial experiments the groups should be either
of equal size or proportional. Departures from equality or proportionality
should be avoided.
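The statement that equal or proportional group sizes make the two treatment variables independent can be checked by computing χ² on the table of group sizes. The sketch below is illustrative; the three sets of group sizes are hypothetical, and `chi_square` is a helper written for this example.

```python
def chi_square(table):
    """Chi-square for an R x C table of counts, expected values from the margins."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            total += (observed - expected) ** 2 / expected
    return total


# Hypothetical numbers of subjects in the six groups of a 2 x 3 design.
equal = [[4, 4, 4], [4, 4, 4]]
proportional = [[2, 4, 6], [4, 8, 12]]
disproportional = [[2, 4, 6], [6, 4, 2]]

for name, sizes in (("equal", equal),
                    ("proportional", proportional),
                    ("disproportional", disproportional)):
    print(name, round(chi_square(sizes), 2))
```

Only the disproportional arrangement yields a χ² greater than zero, which indicates that learning condition and teaching method are then associated in the design itself.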

One advantage of the factorial experiment is that information is obtained
about the interaction between factors. For example, in the experiment on
teaching French one method of teaching may interact with a condition of
learning and render that combination either better or worse than any other
combination. The concept of interaction is discussed more explicitly, and
in detail, in Sec. 16.5. One disadvantage of the factorial experiment is that
the number of combinations may become quite unwieldy, and, from a
practical point of view, the experiment may be very difficult to conduct. Also,
the meaningful interpretation of the interactions may prove difficult. In
general, in psychology and education, it is usually advisable to avoid
factorial experiments with more than a few factors.
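A first look at possible interaction in the French experiment can be had from the six cell means. The sketch below uses the achievement-test scores as read from the table of this section (S for spaced, M for massed); the variable names are illustrative, and the comparison of differences is only a descriptive aid, not the formal analysis taken up later in the book.

```python
scores = {
    ("S", "A"): [72, 96, 34, 55],
    ("S", "B"): [78, 25, 29, 20],
    ("S", "C"): [16, 24, 29, 41],
    ("M", "A"): [81, 99, 85, 62],
    ("M", "B"): [75, 36, 45, 26],
    ("M", "C"): [19, 41, 19, 46],
}

# Mean score for each combination of learning condition and method.
cell_means = {cell: sum(values) / len(values) for cell, values in scores.items()}

for method in ("A", "B", "C"):
    spaced = cell_means[("S", method)]
    massed = cell_means[("M", method)]
    # If the two factors did not interact, the massed-minus-spaced
    # difference would be about the same for every method.
    print(method, spaced, massed, massed - spaced)
```

Whether differences of this kind exceed what sampling error would produce is a question for the analysis of variance.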

The reader should note that factorial experiments may involve repeated
measurement on the same subjects. For example, in a 3 × 2 design,
repeated measurements may be made on the same subjects under each of
the six treatment combinations. The result is an N × 3 × 2 arrangement of
numbers.


14.7 OTHER EXPERIMENTAL DESIGNS 


Situations occur where a basis exists for the classification of experimental
subjects into r subgroups or blocks. An experiment may be conducted
involving k treatments, where the number of subjects used is such that
N = rk. Subjects in each block may be assigned at random to treatments,
under the restriction that each treatment occur once only in each block.
Such an experiment is a randomized block experiment. To illustrate, an
experiment involves a comparison of four treatments. A sample of 24
experimental animals is used, which consists of six sets of four litter mates.
Each set of litter mates is a block. Treatments are allocated to the animals
in each block, or vice versa, on a random basis. If we designate the four
treatments as A, B, C, and D, the allocation of treatments to subjects may
be as follows:


              Animal within block

Block      1      2      3      4

  1        A      D      C      B
  2        C      A      B      D
  3        A      B      D      C
  4        B      C      D      A
  5        D      B      A      C
  6        D      A      C      B


In previous discussion of order effects in single-factor experiments
involving repeated measurements on the same subjects, it was suggested
that the order of treatment be randomized for each subject. This is, in fact,
a randomized block design, where the subject becomes the analogue of the
block, and the order of presentation of treatment conditions A, B, C, D is
represented by column headings.
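A randomized block allocation of the kind described in this section may be produced by drawing an independent random order of the treatments for each block. The sketch below is illustrative only; the function name and seed are assumptions, and the same routine serves whether the block is a set of litter mates or a single subject measured repeatedly.

```python
import random


def randomized_blocks(n_blocks, treatments, rng):
    """Return one random ordering of the treatments for each block, so that
    every treatment occurs once only within each block."""
    layout = []
    for _ in range(n_blocks):
        order = list(treatments)
        rng.shuffle(order)  # independent randomization within the block
        layout.append(order)
    return layout


rng = random.Random(6)  # arbitrary seed, used only for repeatability
layout = randomized_blocks(6, ["A", "B", "C", "D"], rng)
for block_number, order in enumerate(layout, start=1):
    print(block_number, " ".join(order))
```

Each printed row corresponds to one block, and each column to an animal within the block or to a position in the order of presentation.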

Experiments may be considered in which the number of blocks is equal
to the number of treatments or a multiple thereof; thus N = k² or some
multiple of k². In such experiments each treatment occurs once only in each
block and once only over blocks. Thus a particular arrangement may be:

     A      B      D      C
     D      A      C      B
     C      D      B      A
     B      C      A      D

This is a Latin square design. A Latin square is simply an arrangement in
which each treatment occurs once only in each row and column. Tables of
Latin squares appear in Fisher and Yates (1963). These authors suggest
methods of choosing a Latin square at random from sets of possible Latin
squares. When an experiment requires more than one Latin square, they
must be chosen independently.
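The defining property of a Latin square, each treatment once only in each row and column, is easily checked by machine. The sketch below verifies the 4 × 4 arrangement given above and constructs a further square by permuting the rows and columns of a cyclic square. This construction is an assumption adopted for simplicity; it is not the selection procedure of Fisher and Yates and does not choose with equal probability among all possible Latin squares.

```python
import random


def is_latin_square(square):
    """True if every symbol occurs once only in each row and each column."""
    k = len(square)
    symbols = set(square[0])
    if len(symbols) != k:
        return False
    rows_ok = all(len(row) == k and set(row) == symbols for row in square)
    cols_ok = all(set(col) == symbols for col in zip(*square))
    return rows_ok and cols_ok


def shuffled_cyclic_square(treatments, rng):
    """Latin square obtained by permuting rows and columns of a cyclic square."""
    k = len(treatments)
    cyclic = [[treatments[(i + j) % k] for j in range(k)] for i in range(k)]
    rng.shuffle(cyclic)  # permute rows
    col_order = list(range(k))
    rng.shuffle(col_order)  # permute columns
    return [[row[c] for c in col_order] for row in cyclic]


text_square = [list("ABDC"), list("DACB"), list("CDBA"), list("BCAD")]
print(is_latin_square(text_square))

rng = random.Random(5)  # arbitrary seed, used only for repeatability
for row in shuffled_cyclic_square(list("ABCDE"), rng):
    print(" ".join(row))
```

Permuting rows and columns preserves the Latin property, since each row and each column of the cyclic square already contains every treatment once.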


14.8 CLASSIFICATION VARIABLES

Hitherto we have discussed experiments in which the independent
variables are treatment variables. In many investigations in psychology and
education the independent variables are not treatment variables but are, in
fact, classification variables; that is, subjects are classified according to a
characteristic which was present prior to the conduct of the experiment
and did not result from the manipulations of the investigator. For example,
normals, neurotics, and psychotics may be compared on flicker-fusion
rates, or some other variable. Here the independent variable is a
classification, and not a treatment, variable. No treatment by the investigator is
involved. Quite clearly in such an experiment it is not possible to assign
subjects to experimental groups at random. Randomization is not possible,
because the attribute which determines membership in a particular group
is not under the control of the investigator.

Three types of situations in which classification variables are involved
may be recognized. First, the sample of subjects used may be a random
sample drawn from a defined population, and the proportions in the various
classes, or strata, may, within the limits of sampling error, correspond to
the population proportions. The object of the investigation is to describe
relations between the classification variable and some other variable. An
investigation of this type is, in effect, not an experiment at all, but a
correlational study. Second, samples may be drawn for comparison at random
from two or more subpopulations, but the proportions in these samples
may not correspond to the population proportions. For example, an
investigator may choose to compare 50 normals with 50 psychotics on a
specified variable or variables. Clearly the proportions in the two samples
do not correspond to the proportions of normals and psychotics in the
population. This, in effect, is a form of disproportional stratified sampling. Also
we note that certain classes which exist in the population at large may be
excluded. Thus, for example, not all nonpsychotics are necessarily normal.
This again is, in effect, a correlational study, the object of which is simply
to identify differences between groups. Frequently there is a
presupposition that a knowledge of such differences may ultimately prove useful in
the construction of causal arguments. Third, investigations may be con- 
ducted involving classification variables in which the attempt is made to 
control the influence of certain variables which in a straightforward corre- 
lational study would not be controlled. In comparing hospitalized normals 
with hospitalized psychotics, the two groups may be equated for age, IQ,
sex, length of hospitalization, socioeconomic status, and other variables 
which might be construed to be correlated with the dependent variable. In 
studies such as this the investigator excludes certain variables from the 
group of causal influences affecting the results observed. Although no 
direct causal argument linking the classification variable and the depen- 
dent variable can be advanced, the influence of certain variables on the 
results observed can be excluded. Such investigations narrow the range of 
possible causal influences. Factorial investigations involving classification 
variables may perhaps be construed legitimately to be experiments in that 


by design they involve the experimental control of certain variables which 
otherwise might be uncontrolled. 


14.9 CONCLUDING OBSERVATION


Discussion in this chapter has touched briefly on a few aspects only of the 
structure and planning of experiments. This subject can be elaborated at 
great length and complexity. For a more comprehensive discussion the 


reader is referred to Cochran and Cox (1957), Cox (1961), Finney (1960), 
and Winer (1962). 


EXERCISES

1 Distinguish, with examples, between (a) an independent and a
dependent variable and (b) a treatment and a classification variable.

2 Discuss the rationale underlying the use of matched samples in the
planning of experiments.

3 Why does randomization eliminate systematic bias in experimental
results?

4 Outline a randomization procedure for allocating 100 experimental
subjects to five experimental groups.

5 A group of 10 experimental subjects receives four treatments. Describe
a procedure to eliminate order effects.

6 What is the rationale underlying the use of equal or proportional groups
in factorial experiments?

7 What is meant by a randomized block experiment?

8 Give an example of a Latin square containing five rows and five
columns.

9 How would you distinguish between an experiment and a correlational
study?


ANALYSIS OF VARIANCE:
ONE-WAY CLASSIFICATION


15.1 ITS NATURE AND PURPOSE


ments. Obviously, if we aret 
ticular causal circumstances, experime 
to occur in a logically rigorous fashion 

The partitioning of variance 
particular body of technology К 


, the analysis of 
ide this variance into additive parts. The 
into additive parts. These 


fluctuation is expected between means. If the variation cannot reasonably
be attributed to sampling error, we reject the null hypothesis and accept
the alternative hypothesis that the treatments applied are having an effect.
With only two means, k = 2, this approach leads to the same result as that
obtained from the t test for the significance of the difference between
means for independent samples.
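The equivalence for k = 2 can be illustrated numerically: the ratio of the between-groups to the within-groups variance estimate is equal to the square of t for independent samples. The sketch below uses two small sets of hypothetical scores and computes both quantities from first principles; the function names are illustrative.

```python
def mean(values):
    return sum(values) / len(values)


def t_independent(x, y):
    """t for the difference between two independent means (pooled variance)."""
    ss_x = sum((v - mean(x)) ** 2 for v in x)
    ss_y = sum((v - mean(y)) ** 2 for v in y)
    pooled = (ss_x + ss_y) / (len(x) + len(y) - 2)
    return (mean(x) - mean(y)) / (pooled * (1 / len(x) + 1 / len(y))) ** 0.5


def f_one_way(groups):
    """Ratio of between-groups to within-groups variance estimates."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical scores for two independent groups.
group_1 = [72, 96, 34, 55]
group_2 = [81, 99, 85, 62]

t = t_independent(group_1, group_2)
f = f_one_way([group_1, group_2])
print(t ** 2, f)
```

The two printed values agree apart from rounding error in the arithmetic.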

Consider an agricultural experiment undertaken to compare yields of
four varieties of wheat. Thirty-two experimental plots are prepared, and
each of the four varieties grown in eight plots. Thus k = 4 and n = 8.
Assume that appropriate precautions have been exercised to randomize
uncontrolled factors such as variation in soil fertility from plot to plot. The
yield for each plot is obtained, and the mean yield for each variety on the
eight plots calculated. Differences in yield reflect themselves in the
variation in the four means. If this variation is small and can be explained by
sampling error, the investigator has no grounds for rejecting the null
hypothesis that no difference exists between the yields of the four varieties.
If the variation between means is not small and of such magnitude that it
could arise in random sampling in less than 1 or 5 per cent of cases, then
the evidence is sufficient to warrant rejection of the null hypothesis and
acceptance of the alternative hypothesis that the varieties differ in yield.

In the above agricultural experiment the sampling unit is the plot. In
psychological experimentation the analogue of the plot is usually either a
human subject or an experimental animal. In an experiment on the relative
efficacy of four different methods of memorizing nonsense syllables, four
groups of subjects may be selected, a different method used on each group,
and means on a measure of recall obtained for the four groups. A
comparison of these means provides information on the relative efficacy of the
different methods, and the analysis of variance may be used to decide
whether the variation between means is greater than that expected from
random sampling fluctuation.

The problem of testing the significance of the differences between a
number of means results from experiments designed to study the variation
in a dependent variable with variation in an independent variable. The
independent variable may be varieties of wheat, methods of memorizing
nonsense syllables, or different environmental conditions. The dependent
variable may be crop yield, number of nonsense syllables recalled, or
number of errors made by an animal in running a maze. Experiments which
employ one independent variable are said to involve one basis of
classification. The analysis of variance may be used in the analysis of data resulting
from experiments which involve more than one basis of classification. For
example, an experiment may be designed to permit the study both of
varieties of wheat and types of fertilizer on crop yield. This experiment
employs two independent variables. We wish to discover how crop yield
depends on these two variables. The analysis of variance may be used to
extract a part of the total variation resulting from the differences in
varieties of wheat and another part resulting from differences in fertilizers,
in addition to interaction and error components. A further example is a
psychological experiment designed to permit the study of the effects of both
free-versus-restricted environment and early-versus-late blindness on 
maze performance in the rat. Here we have two independent variables. 
Each variable has two categories. There are four combinations of condi- 
tions: free environment and early blindness, free environment and late 
blindness, restricted environment and early blindness, restricted environ- 
ment and late blindness. Four groups of experimental animals may be 
used, and one of the four conditions applied to each group. The analysis of 
variance may be applied to identify parts of the variation in maze perform- 
ance assignable to the different environmental and blindness conditions. 


and other parts as well. Experiments may be designed to permit the simul- 
taneous study of any number of experimental variables within practical 
limits. 


Let us proceed by considering in detail the simple case of a
one-way-classification problem where the analysis of variance provides a
composite test of the significance of the difference between a set of means.


15.2 NOTATION FOR ONE-WAY ANALYSIS OF VARIANCE 


Consider an experiment involving $k$ experimental treatments. The
treatments may be different dosages of a drug, different methods of
memorizing nonsense syllables, or different environmental variations in the
rearing of experimental animals. Each treatment is applied to a different
experimental group. Denote the number of members in the $k$ groups by
$n_1, n_2, \ldots, n_k$. The number of members in the $j$th group is $n_j$.
The total number of members in all groups combined is
$n_1 + n_2 + \cdots + n_k = N$. When the groups are of equal size we may
write $n_1 = n_2 = \cdots = n_k = n = N/k$. The data may be represented as
follows:

$$
\begin{array}{cccc}
\text{Group 1} & \text{Group 2} & \cdots & \text{Group } k \\
\hline
X_{11} & X_{12} & \cdots & X_{1k} \\
X_{21} & X_{22} & \cdots & X_{2k} \\
X_{31} & X_{32} & \cdots & X_{3k} \\
\vdots & \vdots &        & \vdots \\
X_{n_1 1} & X_{n_2 2} & \cdots & X_{n_k k} \\
\hline
\sum_{i=1}^{n_1} X_{i1} & \sum_{i=1}^{n_2} X_{i2} & \cdots & \sum_{i=1}^{n_k} X_{ik}
\end{array}
$$


Here a system of double subscripts is used. The first subscript identifies
the member of the group; the second identifies the group. Thus $X_{21}$
represents the measurement for the second member of the first group,
$X_{32}$ represents the measurement for the third member of the second
group, and so on. In general, the symbol $X_{ij}$ means the $i$th member of
the $j$th group. Where the data for each group are tabulated in a separate
column, the first subscript identifies the row and the second the column.
The sum of measurements in the $k$ groups is represented by

$$
\sum_{i=1}^{n_1} X_{i1}, \quad \sum_{i=1}^{n_2} X_{i2}, \quad \ldots, \quad \sum_{i=1}^{n_k} X_{ik}
$$

We may denote the group means by $\bar{X}_{.1}, \bar{X}_{.2}, \ldots,
\bar{X}_{.k}$. The symbol $\bar{X}_{.1}$ refers to the mean of the first
column, $\bar{X}_{.2}$ the mean of the second column, and $\bar{X}_{.j}$ the
mean of the $j$th column. The convention is to use a dot to indicate the
variable subscript over which the summation extends. The mean of all the
observations taken together may be represented by the symbol $\bar{X}_{..}$,
sometimes called the grand mean. In a one-way classification the meaning
associated with the various symbols is quite clear without the use of the
dot notation. In discussion of one-way classification we shall therefore
simplify the notation and represent the group means by $\bar{X}_1,
\bar{X}_2, \ldots, \bar{X}_k$ and the grand mean by $\bar{X}$. The dot
notation is necessary in the more complex applications of the analysis of
variance, and we shall return to it in Chap. 16.

The total variation in the data is represented by the sum of squares of
deviations of all the observations from the grand mean. The sum of squares
of deviations of the $n_1$ observations in the first group from the grand
mean is $\sum_{i=1}^{n_1} (X_{i1} - \bar{X})^2$, and the sum of squares of
the $n_j$ observations in the $j$th group from the grand mean is
$\sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2$. For $k$ groups each comprised of
$n_j$ observations the total sum of squares of deviations about $\bar{X}$ is

$$
\sum_{j=1}^{k} \sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2
$$

When the meaning is clearly understood from the context, it is common
practice to represent this total sum of squares by
$\sum (X_{ij} - \bar{X})^2$, or more simply by $\sum (X - \bar{X})^2$.


15.3 PARTITIONING THE SUM OF SQUARES

Simple algebra may be used to demonstrate that the total sum of squares
may be divided into two additive and independent parts, a within-group
sum of squares and a between-group sum of squares. We proceed by
writing the identity

$$
(X_{ij} - \bar{X}) = (X_{ij} - \bar{X}_j) + (\bar{X}_j - \bar{X})
$$

This identity states that the deviation of a particular score from the grand
mean is comprised of two parts, a deviation from the mean of the group to
which the score belongs, $(X_{ij} - \bar{X}_j)$, and a deviation of the
group mean from the grand mean, $(\bar{X}_j - \bar{X})$. We square this
identity and sum over the $n_j$ cases in the $j$th group to obtain

$$
\sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2
  = \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2
  + \sum_{i=1}^{n_j} (\bar{X}_j - \bar{X})^2
  + 2(\bar{X}_j - \bar{X}) \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)
$$

The second term to the right requires the summation of a constant
$(\bar{X}_j - \bar{X})^2$ over all $n_j$ values of the $j$th group and may
be written $n_j(\bar{X}_j - \bar{X})^2$. The third term to the right
disappears because the sum of deviations about the mean $\bar{X}_j$ is zero.
We obtain thereby

$$
\sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2
  = \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2 + n_j(\bar{X}_j - \bar{X})^2
$$

This expression says that the sum of the squares of the deviations of the
$n_j$ observations in the $j$th group from the grand mean $\bar{X}$ is equal
to the sum of squares of deviations of the observations from the group mean
plus $n_j$ times the square of the difference between the group mean and the
grand mean. We now sum over the $k$ groups to obtain

$$
\sum_{j=1}^{k} \sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2
  = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2
  + \sum_{j=1}^{k} n_j(\bar{X}_j - \bar{X})^2
\qquad [15.1]
$$

The term to the left is the total sum of squares: the sum of squares of
deviations of all the observations from the grand mean $\bar{X}$. The first
term to the right is the sum of squares within groups: the sum of squares of
deviations from the respective group means. The second and last term to the
right is the sum of squares between groups: the sum of squares of deviations
of the group means from the grand mean, each term $(\bar{X}_j - \bar{X})^2$
being weighted by $n_j$, the number of cases in the group. Thus the total
sum of squares is partitioned into two additive parts, a sum of squares
within groups, and a sum of squares between groups. These two parts are
independent.
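
The partition can be checked numerically. The following Python sketch is an illustration only: the function name `partition_sums_of_squares` and the small data set are hypothetical and do not come from the text. It computes the three sums of squares directly from their definitions, for groups of unequal size, so that the total can be compared with the sum of the within-groups and between-groups parts.

```python
def partition_sums_of_squares(groups):
    """Return (total, within, between) sums of squares.

    `groups` is a list of lists of scores; the groups may differ in size.
    """
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)

    # Total: deviations of every observation from the grand mean.
    total = sum((x - grand_mean) ** 2 for x in all_scores)

    within = 0.0
    between = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        # Within: deviations of scores from their own group mean.
        within += sum((x - group_mean) ** 2 for x in g)
        # Between: squared deviation of the group mean from the grand
        # mean, weighted by the number of cases in the group.
        between += len(g) * (group_mean - grand_mean) ** 2
    return total, within, between


# Hypothetical scores for k = 3 groups of unequal size.
example_groups = [[2, 4, 6], [5, 7, 9, 11], [1, 3]]
total, within, between = partition_sums_of_squares(example_groups)
print(total, within, between)
```

Apart from floating-point rounding, the first value printed equals the sum of the other two, whatever scores are supplied.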


15.4 THE VARIANCE ESTIMATES OR MEAN SQUARES

Each sum of squares has an associated number of degrees of freedom. The
total number of observations is $n_1 + n_2 + \cdots + n_k = \sum n_j = N$.
The total sum of squares has $N - 1$ degrees of freedom. One degree of
freedom is lost by taking deviations about the grand mean. $N - 1$ of these
deviations are free to vary. The number of degrees of freedom associated
with the within-groups sum of squares is

$$
(n_1 - 1) + (n_2 - 1) + \cdots + (n_k - 1) = \sum_{j=1}^{k} n_j - k = N - k
$$

The number of degrees of freedom for each group is $n_j - 1$. Hence the
number of degrees of freedom for $k$ groups is $\sum n_j - k$, or $N - k$.
The number of degrees of freedom associated with the between-groups sum of
squares is $k - 1$. We have $k$ means, and 1 degree of freedom is lost by
expressing the group means as deviations from the grand mean. The degrees of
freedom are additive:

$$
\underbrace{N - 1}_{\text{total}} = \underbrace{(N - k)}_{\text{within}} + \underbrace{(k - 1)}_{\text{between}}
$$

The within- and between-groups sums of squares are divided by their
associated degrees of freedom to obtain a within-groups variance estimate
$s_w^2$ and a between-groups variance estimate $s_b^2$. Thus

$$
s_w^2 = \frac{\sum_{j=1}^{k} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2}{N - k}
\qquad [15.2]
$$

$$
s_b^2 = \frac{\sum_{j=1}^{k} n_j(\bar{X}_j - \bar{X})^2}{k - 1}
\qquad [15.3]
$$

The sums of squares and degrees of freedom are additive. The variance
estimates are not additive. The variance estimate is sometimes spoken of
as the mean square.

15.5 THE MEANING OF THE VARIANCE ESTIMATES

What meaning attaches to the variance estimates $s_w^2$ and $s_b^2$? Let us
assume that the $k$ samples are drawn from populations having the same
variance. The assumption is that $\sigma_1^2 = \sigma_2^2 = \cdots =
\sigma_k^2 = \sigma^2$. If this assumption is tenable, the expected value of
$s_w^2$ is $\sigma^2$; that is, $E(s_w^2) = \sigma^2$. Thus $s_w^2$ is an
unbiased estimate of the population variance. It is an estimate obtained by
combining the data for the $k$ samples. It may be written in the form

$$
s_w^2 = \frac{\sum_{i=1}^{n_1} (X_{i1} - \bar{X}_1)^2 + \sum_{i=1}^{n_2} (X_{i2} - \bar{X}_2)^2 + \cdots + \sum_{i=1}^{n_k} (X_{ik} - \bar{X}_k)^2}{n_1 + n_2 + \cdots + n_k - k}
\qquad [15.4]
$$

The reader will recall that in applying the $t$ test to determine the
significance of the differences between two means for independent samples,
an unbiased estimate of the population variance was obtained by combining
the sums of squares about the means of the two samples and dividing this
by the total number of degrees of freedom. The within-group variance
$s_w^2$ is an estimate of precisely the same type. It is obtained by adding
together the sums of squares about the $k$ sample means and dividing this by
the total number of degrees of freedom. The variance estimate used in the
$t$ test is the particular case of $s_w^2$ which occurs when $k = 2$.

The expected value of $s_b^2$ may be shown to be

$$
E(s_b^2) = \sigma^2 + \frac{\sum_{j=1}^{k} n_j(\mu_j - \mu)^2}{k - 1}
\qquad [15.5]
$$

where $\mu_j$ and $\mu$ are population means. Under the null hypothesis
$\mu_1 = \mu_2 = \cdots = \mu_k = \mu$, and the second term to the right of
the above expression is equal to zero. Hence under the null hypothesis both
$s_w^2$ and $s_b^2$ are estimates of the population variance $\sigma^2$.

That $s_b^2$ is an estimate of $\sigma^2$ under the null hypothesis may be
illustrated by considering the particular situation where
$n_1 = n_2 = \cdots = n_k = n$. The between-group variance estimate may then
be written as

$$
s_b^2 = \frac{n \sum_{j=1}^{k} (\bar{X}_j - \bar{X})^2}{k - 1}
$$

This is $n$ times the variance of the $k$ means, or $n s_{\bar{x}}^2$. The
error variance of the sampling distribution of the arithmetic mean for
samples of size $n$ is given by $\sigma_{\bar{x}}^2 = \sigma^2/n$. Hence
$n\sigma_{\bar{x}}^2 = \sigma^2$. The quantity $n s_{\bar{x}}^2$ is an
estimate of $n\sigma_{\bar{x}}^2$, hence also of $\sigma^2$. Thus $s_b^2$ is
an estimate of $\sigma^2$.

Where the null hypothesis is false and the means of the populations from
which the $k$ samples are drawn differ one from another, the second term to
the right of the expression for $E(s_b^2)$ is not equal to zero. It is a
measure of the variation of the separate population means $\mu_j$ from the
grand mean $\mu$.

To test the hypothesis $H_0\colon \mu_1 = \mu_2 = \cdots = \mu_k$ we
consider the ratio $s_b^2/s_w^2$. This is an $F$ ratio. Under the null
hypothesis the expected value of this ratio is approximately unity, since
$E(s_b^2) = E(s_w^2) = \sigma^2$. If the population means differ from each
other, $E(s_b^2/s_w^2)$ will be greater than unity. If $s_b^2/s_w^2$ is
found to be significantly greater than unity, this may be construed to be
evidence for the rejection of the null hypothesis.
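
The behavior of the two variance estimates can be illustrated by simulation. The sketch below is not part of the original text: the function names, the population values (four normal populations with $\sigma = 2$, samples of $n = 8$), the number of replications, and the seed are all assumptions chosen for illustration. It averages $s_b^2$ and $s_w^2$ over many simulated experiments, first with equal population means and then with one mean displaced.

```python
import random


def variance_estimates(groups):
    """Return (s_b^2, s_w^2) for a list of groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    return ss_between / (k - 1), ss_within / (n_total - k)


def average_estimates(pop_means, sigma, n, replications, seed):
    """Average s_b^2 and s_w^2 over many simulated experiments."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sum_b = sum_w = 0.0
    for _ in range(replications):
        groups = [[rng.gauss(mu, sigma) for _ in range(n)]
                  for mu in pop_means]
        s_b2, s_w2 = variance_estimates(groups)
        sum_b += s_b2
        sum_w += s_w2
    return sum_b / replications, sum_w / replications


# Null hypothesis true: all population means equal, sigma^2 = 4.
null_b, null_w = average_estimates([10, 10, 10, 10], 2.0, 8, 2000, seed=1)
# Null hypothesis false: the fourth population mean is displaced.
alt_b, alt_w = average_estimates([10, 10, 10, 13], 2.0, 8, 2000, seed=1)
print(null_b, null_w, alt_b, alt_w)
```

With equal population means both averages should fall near $\sigma^2 = 4$. With the displaced mean the average of $s_w^2$ should remain near 4, while the average of $s_b^2$ should approach $4 + 8(6.75)/3 = 22$, the value given by the expression for $E(s_b^2)$.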


15.6 COMPUTATION FORMULAS

The calculation of the sums of squares is simplified by the use of
computation formulas. To simplify the notation, denote the sum of all the
observations in the $j$th group by $T_j$ and the sum of all $N$ observations
by $T$. Thus

$$
T_j = \sum_{i=1}^{n_j} X_{ij}, \qquad T = \sum_{j=1}^{k} T_j
$$

The computation formulas are readily derived. The total sum of squares is

$$
\sum_{j=1}^{k} \sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2
  = \sum_{j=1}^{k} \sum_{i=1}^{n_j} X_{ij}^2 - \frac{T^2}{N}
\qquad [15.6]
$$

Thus we find the sum of squares of all observations and subtract $T^2/N$.
The within-groups sum of squares is

$$
\sum_{j=1}^{k} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2
  = \sum_{j=1}^{k} \sum_{i=1}^{n_j} X_{ij}^2 - \sum_{j=1}^{k} \frac{T_j^2}{n_j}
\qquad [15.7]
$$

The quantity $T_j^2/n_j$ is the square of the sum of the $j$th group divided
by the number of cases in that group. These values are calculated and
summed over the $k$ groups. The between-groups sum of squares is

$$
\sum_{j=1}^{k} n_j(\bar{X}_j - \bar{X})^2
  = \sum_{j=1}^{k} \frac{T_j^2}{n_j} - \frac{T^2}{N}
\qquad [15.8]
$$

The above formulas are generally applicable to groups of unequal or equal
size. In the particular case where the groups are of equal size and
$n_1 = n_2 = \cdots = n_k = n$, the within-groups sum of squares may be
written as

$$
\sum_{j=1}^{k} \sum_{i=1}^{n} X_{ij}^2 - \frac{\sum_{j=1}^{k} T_j^2}{n}
\qquad [15.9]
$$

and the between-groups sum of squares becomes

$$
\frac{\sum_{j=1}^{k} T_j^2}{n} - \frac{T^2}{N}
\qquad [15.10]
$$

15.7 SUMMARY

Table 15.1 presents in summary form the formulas hitherto discussed.
In summary, to test the significance of the difference between $k$ means
using the analysis of variance, the following steps are involved:

1 Partition the total sum of squares into two components, a within-groups
and a between-groups sum of squares, using the appropriate computation
formulas.

2 Divide these sums of squares by the associated number of degrees of
freedom to obtain $s_w^2$ and $s_b^2$, the within- and between-groups
variance estimates.

3 Calculate the $F$ ratio $s_b^2/s_w^2$ and refer this to the table of $F$
(Table D of the Appendix).

4 If the probability of obtaining the observed $F$ value is small, say, less
than .05 or .01, under the null hypothesis, reject that hypothesis.
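
Steps 1 to 3 can be sketched in a few lines of Python using the computation formulas. This is a minimal illustration rather than a reference implementation: the function name `one_way_anova`, the dictionary keys, and the small data set are hypothetical. Step 4, the comparison with a tabled value of $F$, is left to the reader because no table is reproduced here.

```python
def one_way_anova(groups):
    """One-way analysis of variance by the computation formulas.

    `groups` is a list of lists of scores; group sizes may differ.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)                    # N
    grand_total = sum(sum(g) for g in groups)                # T
    sum_sq_scores = sum(x * x for g in groups for x in g)    # sum of X^2
    sum_tj2_over_nj = sum(sum(g) ** 2 / len(g) for g in groups)
    correction = grand_total ** 2 / n_total                  # T^2 / N

    # Step 1: partition the total sum of squares.
    ss_between = sum_tj2_over_nj - correction
    ss_within = sum_sq_scores - sum_tj2_over_nj
    ss_total = sum_sq_scores - correction

    # Step 2: divide by the associated degrees of freedom.
    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within

    # Step 3: form the F ratio, to be referred to a table of F.
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "ss_total": ss_total,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# Hypothetical scores for three groups of three subjects each.
result = one_way_anova([[3, 5, 4], [6, 8, 7], [9, 11, 10]])
print(result)
```

For these hypothetical scores the between-groups sum of squares is 54 with 2 degrees of freedom, the within-groups sum of squares is 6 with 6 degrees of freedom, and the resulting $F$ ratio is 27.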


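Table 15.1 Analysis of variance: one-way classification; summary of formulas

$$
\begin{array}{lcccc}
\text{Source of variation} & \text{Sum of squares} & \text{Degrees of freedom} & \text{Variance estimate (mean square)} & \text{Expectation} \\
\hline
\text{Between} & \sum_{j} n_j(\bar{X}_j - \bar{X})^2 & k - 1 & s_b^2 = \dfrac{\sum_{j} n_j(\bar{X}_j - \bar{X})^2}{k - 1} & \sigma^2 + \dfrac{\sum_{j} n_j(\mu_j - \mu)^2}{k - 1} \\
\text{Within} & \sum_{j} \sum_{i} (X_{ij} - \bar{X}_j)^2 & N - k & s_w^2 = \dfrac{\sum_{j} \sum_{i} (X_{ij} - \bar{X}_j)^2}{N - k} & \sigma^2 \\
\text{Total} & \sum_{j} \sum_{i} (X_{ij} - \bar{X})^2 & N - 1 & & \\
\end{array}
$$

$$
\begin{array}{lcc}
\text{Source of variation} & \text{Computation formulas} & \text{Computation formulas, equal groups} \\
\hline
\text{Between} & \sum_{j} \dfrac{T_j^2}{n_j} - \dfrac{T^2}{N} & \dfrac{\sum_{j} T_j^2}{n} - \dfrac{T^2}{N} \\
\text{Within} & \sum_{j} \sum_{i} X_{ij}^2 - \sum_{j} \dfrac{T_j^2}{n_j} & \sum_{j} \sum_{i} X_{ij}^2 - \dfrac{\sum_{j} T_j^2}{n} \\
\text{Total} & \sum_{j} \sum_{i} X_{ij}^2 - \dfrac{T^2}{N} & \sum_{j} \sum_{i} X_{ij}^2 - \dfrac{T^2}{N} \\
\end{array}
$$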


15.8 ILLUSTRATIVE EXAMPLE: ONE-WAY CLASSIFICATION

Table 15.2 shows the number of nonsense syllables recalled by four groups
of subjects using four different methods of presentation. Fictitious data
are used here for simplicity of illustration. The sums of squares have been
calculated using the computation formulas. The data are presented in
summary form in Table 15.3. The number of groups is 4. The number of
degrees of freedom associated with the between-groups sum of squares is
$k - 1 = 4 - 1 = 3$. The number of degrees of freedom associated with the
within-groups sum of squares is $N - k = 26 - 4 = 22$. The number of
degrees of freedom associated with the total is $N - 1 = 26 - 1 = 25$. The
between and within sums of squares are divided by the associated degrees
of freedom to obtain the variance estimates $s_b^2$ and $s_w^2$.
The $F$ ratio is $F = s_b^2/s_w^2 = 27.41/3.91 = 7.01$. Consulting a table
of $F$ with df = 3 associated with the numerator and df = 22 with the
denominator, we find that the value of $F$ required for significance at the
.01 level is 4.82. We may safely conclude that the method of presentation
affects the number of nonsense syllables recalled.

Table 15.2 Computation for the analysis of variance: one-way
classification. Number of nonsense syllables correctly recalled under four
methods of presentation

                          Method
                   I        II       III      IV
                   5        9        8        1
                   7        11       6        3
                   6        8        9        4
                   3        7        5        5
                   9        7        7        1
                   7                 4        4
                   4                 4
                   2
$n_j$              8        5        7        6        $N = 26$
$T_j$              43       42       43       18       $T = 146$
$\bar{X}_j$        5.38     8.40     6.14     3.00     $T^2/N = 819.85$
$\sum X_{ij}^2$    269      364      287      68       $\sum\sum X_{ij}^2 = 988$
$T_j^2/n_j$        231.13   352.80   264.14   54.00    $\sum T_j^2/n_j = 902.07$

Sum of squares
Between    902.07 - 819.85 = 82.22
Within     988.00 - 902.07 = 85.93
Total      988.00 - 819.85 = 168.15

Table 15.3 Analysis of variance for data of Table 15.2

Source of      Sum of      Degrees of     Variance
variation      squares     freedom        estimate
Between        82.22       3              27.41 = $s_b^2$
Within         85.93       22             3.91 = $s_w^2$
Total          168.15      25             $F = 7.01$
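
The arithmetic of this example can be reproduced with a short Python sketch. Two assumptions are made here and should be read as such: the individual scores are a reading of Table 15.2 chosen to agree with its printed group sizes, means, and sums of squares, and the critical value 4.82 is simply the figure quoted in the text, not a value computed by the code. The sums of squares are obtained from the definitional (deviation) formulas, so the result also serves as a check on the computation formulas.

```python
# Scores for the four methods, read from Table 15.2 (a reading that
# matches the printed n_j, group means, and sums of squared scores).
methods = {
    "I": [5, 7, 6, 3, 9, 7, 4, 2],
    "II": [9, 11, 8, 7, 7],
    "III": [8, 6, 9, 5, 7, 4, 4],
    "IV": [1, 3, 4, 5, 1, 4],
}

scores = [x for g in methods.values() for x in g]
n_total = len(scores)            # N
k = len(methods)                 # number of groups
grand_mean = sum(scores) / n_total

ss_between = 0.0
ss_within = 0.0
for g in methods.values():
    mean_j = sum(g) / len(g)
    ss_between += len(g) * (mean_j - grand_mean) ** 2
    ss_within += sum((x - mean_j) ** 2 for x in g)

ms_between = ss_between / (k - 1)        # s_b^2
ms_within = ss_within / (n_total - k)    # s_w^2
f_ratio = ms_between / ms_within

F_CRITICAL_01 = 4.82  # value quoted in the text for df = 3 and 22
significant = f_ratio > F_CRITICAL_01
print(round(ss_between, 2), round(ss_within, 2),
      round(f_ratio, 2), significant)
```

Carried at full precision the ratio comes to about 7.02; the 7.01 reported in Table 15.3 results from dividing the variance estimates after they have been rounded to two decimals. The conclusion is the same in either case.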


15.9 THE ANALYSIS OF VARIANCE WITH TWO GROUPS

With two groups only, the significance of the differences between means
may be tested using either a $t$ test or the analysis of variance. These
procedures lead to the same result. Where $k = 2$ it may be readily shown
that $\sqrt{F} = t$.

Consider a situation where $k = 2$ and $n_1 = n_2 = n$. Under these
circumstances the between-groups variance estimate $s_b^2$ is

$$
s_b^2 = \frac{n(\bar{X}_1 - \bar{X})^2 + n(\bar{X}_2 - \bar{X})^2}{2 - 1}
$$

For groups of equal size the grand mean $\bar{X}$ is halfway between the two
group means $\bar{X}_1$ and $\bar{X}_2$. Thus
$(\bar{X}_1 - \bar{X}) = -(\bar{X}_2 - \bar{X}) = \tfrac{1}{2}(\bar{X}_1 - \bar{X}_2)$
and
$(\bar{X}_1 - \bar{X})^2 = (\bar{X}_2 - \bar{X})^2 = \tfrac{1}{4}(\bar{X}_1 - \bar{X}_2)^2$.
We may therefore write

$$
s_b^2 = \frac{n}{2}(\bar{X}_1 - \bar{X}_2)^2
\qquad [15.11]
$$

When $k = 2$ the within-groups variance estimate $s_w^2$ is the unbiased
variance estimate $s^2$, obtained by adding the two sums of squares about the
means of the two samples and dividing by the total number of degrees of
freedom (Sec. 15.5). Hence

$$
F = \frac{s_b^2}{s_w^2} = \frac{(n/2)(\bar{X}_1 - \bar{X}_2)^2}{s^2}
$$

and

$$
t = \frac{\bar{X}_1 - \bar{X}_2}{s\sqrt{(1/n) + (1/n)}}
$$

Thus $\sqrt{F} = t$ and $F = t^2$. To illustrate, let $n_1 = n_2 = 8$. In
applying the analysis of variance with df = 1 associated with the numerator
and df = 14 associated with the denominator of the $F$ ratio, an $F$ of 4.60
is required for significance at the .05 level. The corresponding $t$ for
df = 14 required for significance at the .05 level is
$\sqrt{4.60} = 2.145$. The $t$ test may be considered a particular case of
the $F$ test. It is a particular case which arises when $k = 2$.

In the above discussion we have considered two groups of equal size.
The result $\sqrt{F} = t$ is, however, quite general and holds when $n_1$
and $n_2$ are unequal. For unequal groups the algebraic development is a bit
more cumbersome than that given here. The grand mean does not fall midway
between the two group means.
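
The equality of $F$ and $t^2$ for unequal groups, stated above without proof, can at least be checked numerically. The following Python sketch is an illustration under stated assumptions: the function name `t_and_f_two_groups` and both sets of scores are hypothetical. It computes $t$ from the pooled variance estimate and $F$ from the between- and within-groups variance estimates for two groups of different size.

```python
import math


def t_and_f_two_groups(group1, group2):
    """Return (t, F) for two independent groups of scores."""
    n1, n2 = len(group1), len(group2)
    mean1, mean2 = sum(group1) / n1, sum(group2) / n2
    ss1 = sum((x - mean1) ** 2 for x in group1)
    ss2 = sum((x - mean2) ** 2 for x in group2)

    # Pooled unbiased variance estimate; with k = 2 this is also s_w^2.
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
    t = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

    # Between-groups variance estimate; k - 1 = 1 degree of freedom.
    grand_mean = (sum(group1) + sum(group2)) / (n1 + n2)
    s_b2 = (n1 * (mean1 - grand_mean) ** 2
            + n2 * (mean2 - grand_mean) ** 2)
    f_ratio = s_b2 / pooled_var
    return t, f_ratio


# Hypothetical scores; the groups are deliberately of unequal size.
t_value, f_value = t_and_f_two_groups([12, 15, 11, 14, 13],
                                      [9, 8, 11, 10, 7, 9, 9])
print(t_value ** 2, f_value)
```

The two printed values agree to within floating-point rounding, although the grand mean no longer lies midway between the group means.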

15.10 ASSUMPTIONS UNDERLYING THE ANALYSIS OF VARIANCE

In the mathematical development of the analysis of variance a number of
assumptions are made. Questions may be raised about the nature of these
assumptions and the extent to which the failure of the data to satisfy them
leads to the drawing of invalid inferences.

One assumption is that the distribution of the dependent variable in the
population from which the samples are drawn is normal. For large samples
the normality of the distribution may be tested using a test of goodness of
fit, although in practice this is rarely done. When the samples are fairly
small, it is usually not possible to rigorously demonstrate lack of normality
in the data. Unless there is reason to suspect a fairly extreme departure
from normality, it is probable that the conclusions drawn from the data
using an $F$ test will not be seriously affected. In general, the effect of a
departure from normality is to make the results appear somewhat more
significant than they are. Consequently, where a fairly gross departure from
normality occurs, a somewhat more rigorous level of confidence than usual
may be employed.

A further assumption in the application of the analysis of variance is that
the variances in the populations from which the samples are drawn are
equal. This is known as homogeneity of variance. A variety of tests of
homogeneity of variance may be applied. These are discussed in more
advanced texts (Winer, ...). Moderate departures from homogeneity
should not seriously affect the inferences drawn from the data. Gross
departures from homogeneity may lead to results which are seriously in
error. Under certain circumstances a transformation of the variable,
which ... Under other circumstances it may ... procedure.

A further assumption is that the effects of various factors on the total
variation are additive, as distinct from, say, multiplicative. The basic
model underlying the analysis of variance is that a given observation may
be partitioned into independent and additive bits, each bit resulting
from ... In most situations there are no grounds to suspect the validity of
this model.

... the assumptions underlying the analysis of variance ... The raw data
of ... which the mathematical ... One advantage of the analysis of variance
is that reasonable departures from the assumptions of normality and
homogeneity may occur without seriously affecting the validity of the
inferences drawn from the data.


15.11 TRANSFORMATION OF DATA


As indicated in Sec. 15.10 the appropriate use of the analysis of variance 
involves assumptions of homogeneity of variance, normality, and addi- 
tivity. With some data these assumptions are obviously not satisfied. Some- 
times it is possible to use a simple transformation, resulting in a set of 
transformed values which conform more closely to one or more of the 
assumptions which the appropriate use of the analysis requires. Most 
transformations are intended to bring the variances closer to equality. In 
many cases the transformed values will approximate the normal form more 
closely than do the original observations. 

Commonly used transformations are the square root transformation,
$\sqrt{X}$, the logarithmic transformation, $\log X$, the reciprocal
transformation, $1/X$, and the arc sine transformation,
$\arcsin \sqrt{X}$. A decision regarding which transformation is appropriate
is made by exploring the relation between the variances and the treatment
means. The nature of this relation determines which transformation to use.

The square root transformation replaces each of the original observations
by its square root. This transformation is appropriate when the
variances are proportional to the means; that is, when
$s_j^2/\bar{X}_j = c$, where $c$ is a constant. Of course with real data this
relation will be only roughly approximated.

Table 15.4 shows an example of a square root transformation. The original
data are shown on the left. Note that the variances differ markedly one
from another. Note also that the variances are roughly proportional to the
means. The ratio $s_j^2/\bar{X}_j$ is in the neighborhood of 1.25.
Transformed values are shown at the right in Table 15.4. Note that the
effect of the transformation is to reduce the disparity between the
variances. The variances for the four groups are now .34, .29, .30, .34, and
the assumption of homogeneity of variance is approximately satisfied in the
transformed data.

With data containing small numbers, and with zeros, it is advisable to
add .5 to each value before taking the square root. The transformation
becomes $\sqrt{X + .5}$.
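
The effect of the square root transformation on the data of Table 15.4 can be examined with a short Python sketch. The observations below are a reading of the left-hand side of the table that agrees with its printed column sums; the helper names `sample_variance` and `variance_spread` are hypothetical, and the ratio of the largest to the smallest variance is used here only as a rough index of heterogeneity, not as a formal test.

```python
import math


def sample_variance(values):
    """Unbiased variance estimate: sum of squares over n - 1."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / (len(values) - 1)


# Original observations for groups I to IV, as read from Table 15.4.
original = {
    "I": [1, 5, 2, 6, 7, 3, 3],
    "II": [5, 9, 5, 11, 12, 5, 8],
    "III": [14, 18, 9, 21, 20, 12, 16],
    "IV": [2, 7, 3, 10, 7, 5, 6],
}

# Replace each observation by its square root.
transformed = {name: [math.sqrt(x) for x in xs]
               for name, xs in original.items()}


def variance_spread(groups):
    """Ratio of the largest to the smallest group variance."""
    variances = [sample_variance(xs) for xs in groups.values()]
    return max(variances) / min(variances)


print(variance_spread(original), variance_spread(transformed))
```

The first ratio is several times larger than the second, which is close to 1. Variances computed at full precision may differ slightly from the rounded figures printed in the table.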

When the variances are proportional, not to the means, but to the square
of the means, a logarithmic transformation, $\log X$, is appropriate. Here
again, of course, with real data the relation $s_j^2/\bar{X}_j^2 = c$ may be
only roughly satisfied. If some of the observations are zero or small
numbers, the suggestion is that the transformation $\log (X + 1)$ be used.
When the standard deviations, and not the variances, are proportional to the
square of the treatment means, that is, when $s_j/\bar{X}_j^2 = c$, a
reciprocal transformation, $1/X$, may prove useful.
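
The case for the logarithmic transformation can be seen in a deliberately artificial example. In the Python sketch below, which is not drawn from the text, the second hypothetical group is simply ten times the first, so the variances are exactly proportional to the squares of the means. The helper name `sample_variance` and the use of base-10 logarithms are assumptions made for illustration.

```python
import math


def sample_variance(values):
    """Unbiased variance estimate: sum of squares over n - 1."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / (len(values) - 1)


def mean(values):
    return sum(values) / len(values)


# Hypothetical data: group_b is ten times group_a, so the standard
# deviation is proportional to the mean.
group_a = [8, 9, 10, 11, 12]
group_b = [80, 90, 100, 110, 120]

# The ratio s^2 / mean^2 is the same constant c in both groups.
ratio_a = sample_variance(group_a) / mean(group_a) ** 2
ratio_b = sample_variance(group_b) / mean(group_b) ** 2

# After a log transformation the two variances coincide.
log_var_a = sample_variance([math.log10(x) for x in group_a])
log_var_b = sample_variance([math.log10(x) for x in group_b])

print(sample_variance(group_a), sample_variance(group_b))
print(ratio_a, ratio_b)
print(log_var_a, log_var_b)
```

The raw variances differ by a factor of 100, while the variances of the logarithms are equal apart from floating-point rounding, because multiplying every score by a constant only adds a constant on the logarithmic scale.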

Table 15.4 Comparison of original and transformed data using a square root
transformation

              Original data, X                Transformed data, $\sqrt{X}$
              I      II     III    IV         I      II     III    IV
              1      5      14     2          1.00   2.24   3.74   1.41
              5      9      18     7          2.24   3.00   4.24   2.65
              2      5      9      3          1.41   2.24   3.00   1.73
              6      11     21     10         2.45   3.32   4.58   3.16
              7      12     20     7          2.65   3.46   4.47   2.65
              3      5      12     5          1.73   2.24   3.46   2.24
              3      8      16     6          1.73   2.83   4.00   2.45
Sum           27     55     110    40         13.21  19.33  27.49  16.29
$\bar{X}_j$   3.86   7.86   15.71  5.71       1.89   2.76   3.93   2.33
$s_j^2$       4.18   8.76   19.06  7.30       .34    .29    .30    .34

In some experiments the observations obtained are proportions or
percentages, such as the proportion of successful responses in a given
number of trials, or the percentage of correct answers in performing a
series of tasks. When the data are proportions, a relation between means and
variances roughly of the kind $s_j^2 = \bar{X}_j(1 - \bar{X}_j)$ may be
observed. Under this circumstance an arc sine transformation,
$\arcsin \sqrt{X}$, may be used. Each of the original observations is
replaced by an angle whose sine is the square root of the original
observation. A useful table of angles corresponding to percentages, that is,
$\arcsin \sqrt{\text{percentage}}$, has been prepared
by C. I. Bliss and is given in Snedecor and Cochran (1967).
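
Where a printed table of angles is not at hand, the arc sine transformation is easily computed. The Python sketch below is a minimal illustration: the function name `arcsine_transform` is hypothetical, and returning the angle in degrees by default is an assumption made here for convenience rather than something stated in the text.

```python
import math


def arcsine_transform(proportion, degrees=True):
    """Return the angle whose sine is the square root of `proportion`.

    `proportion` must lie between 0 and 1; a percentage should be
    divided by 100 before it is passed in.
    """
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    angle = math.asin(math.sqrt(proportion))
    return math.degrees(angle) if degrees else angle


# Hypothetical proportions of successful responses.
proportions = [0.05, 0.25, 0.50, 0.75, 0.95]
angles = [arcsine_transform(p) for p in proportions]
print(angles)
```

A proportion of .25 corresponds to an angle of 30 degrees and a proportion of .50 to 45 degrees. The transformation stretches the scale near 0 and 1, where the variance of a proportion is smallest.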
N 


EXERCISES

1 How many degrees of freedom are associated with the variation in the
data for (a) a comparison of two means for independent samples, each
containing 20 cases, (b) a comparison of four means for independent
samples, each containing 14 cases, (c) a comparison of four means for
independent samples of size 10, 16, 18, and 11, respectively?

2 The following are measurements obtained for five equal groups of
subjects:

Group    Measurements              $\bar{X}_j$
I        4    7    9    9    14    8.60
II       5    6    12   12   7     8.40
III      15   18   21   26   20    20.00
IV       35   27   29   30   25    29.20
V        17   26   17   20   12    18.40

Apply the analysis of variance to test the null hypothesis
$H_0\colon \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$.

3 The following are error scores on a psychomotor test for four groups of
subjects tested under four experimental conditions:

Group    Error scores                          $\bar{X}_j$
I        16   7    19   2    31                19.40
II       2    6    15   25   32   24   29      22.14
III      16   15   18   19   6    13   18      15.00
IV       25   19   16   17   42   45           27.33

Apply the analysis of variance to test the null hypothesis
$H_0\colon \mu_1 = \mu_2 = \mu_3 = \mu_4$.

4 Apply the analysis of variance to test the significance of the difference
between means for the following data:

                            I       II      III
$n$                         10      10      10
$\bar{X}_j$                 7.40    8.30    10.56
$\sum_{i=1}^{n} X_{ij}^2$   649     755     1,263

5 The following are test scores for two groups:

Group    Scores                                $\bar{X}_j$
I        12   26   31   12   14   16   10      17.29
II       8    14   29   7    14   6            13.00

Calculate $s_b^2$, $s_w^2$, and $F$. Test the null hypothesis
$H_0\colon \mu_1 = \mu_2$.

6 What assumptions underlie the analysis of variance?

7 The following are experimental data for three independent groups:

Group    Data                                  $\bar{X}_j$
I        ...
II       49   121   144   169   196            135.80
III      ...  121                              70.80

What would be an appropriate transformation on these data? Transform
the data and apply an analysis of variance to the transformed
values, calculating the required $F$ ratio.




ANALYSIS OF VARIANCE: 
TWO-WAY CLASSIFICATION 


16.1 INTRODUCTION 
Experiments may be designed to permit the simultaneous investigation
of two experimental variables. Such experiments involve two bases of
classification. To illustrate, assume that an investigator wishes to study
the effects of two methods of presenting nonsense syllables on recall after
5 min, 1 hr, and 24 hr. One experimental variable is method of
presentation, the other the interval between presentation and recall. There
are six combinations of experimental conditions. One method of conducting
such an experiment is to select a group of subjects and allocate these at
random to the experimental conditions, an equal number being assigned
to each condition. With, say, 10 subjects allocated to each experimental
condition, the total number of subjects will be
$2 \times 3 \times 10 = 60$. The data may be arranged in a table containing
two rows and three columns. The rows correspond to methods, the columns to
time intervals. The 10 observations for each group may be entered in each
cell of the table. Differences in the means of the rows result from
differences in recall under the two methods of presentation. Differences in
the means of the columns result from differences in recall after the three
time intervals.

Experiments with two-way classification may be conducted with only
one sampling unit, and measurement, for each experimental condition.
With one measurement for each experimental condition the total sum of
squares is partitioned into three parts, a between-rows, a between-columns,
and an interaction sum of squares. With more than one measurement for
each experimental condition, the total sum of squares is partitioned into
four parts, a between-rows, a between-columns, an interaction sum of
squares, and a within-cells sum of squares. Each sum of squares has an
associated number of degrees of freedom. By dividing the sums of squares
by the associated degrees of freedom four variance estimates are obtained.
These variance estimates are used to test the significance of the differences
between row means, column means, and, with more than one measurement
per cell, the interaction effect.


16.2


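The data may be represented as follows:

$$
\begin{array}{c|cccc|c}
 & 1 & 2 & \cdots & C & \text{Row mean} \\
\hline
1 & X_{11} & X_{12} & \cdots & X_{1C} & \bar{X}_{1.} \\
2 & X_{21} & X_{22} & \cdots & X_{2C} & \bar{X}_{2.} \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
R & X_{R1} & X_{R2} & \cdots & X_{RC} & \bar{X}_{R.} \\
\hline
\text{Column mean} & \bar{X}_{.1} & \bar{X}_{.2} & \cdots & \bar{X}_{.C} & \bar{X}_{..}
\end{array}
$$

Here $X_{rc}$ denotes the measurement in the $r$th row and $c$th column,
where $r = 1, 2, \ldots, R$ and $c = 1, 2, \ldots, C$. A dot notation is used
to identify means. The symbol $\bar{X}_{1.}$ refers to the mean of the first
row, $\bar{X}_{2.}$ the mean of the second row, and $\bar{X}_{r.}$ the mean
of the $r$th row. Similarly, $\bar{X}_{.1}$ refers to the mean of the first
column, $\bar{X}_{.2}$ the mean of the second column, and $\bar{X}_{.c}$ the
mean of the $c$th column. The grand mean, the mean of all $N$ observations,
is $\bar{X}_{..}$. The total sum of squares of deviations about the grand
mean is given by

$$
\sum_{r=1}^{R} \sum_{c=1}^{C} (X_{rc} - \bar{X}_{..})^2
$$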

Consider now a situation where we have $n$ sampling units and $n$
measurements for each of the $RC$ treatment combinations. The total number
of measurements is $nRC = N$. Where $R = 2$, $C = 3$, and $n = 3$, the data
may be represented as follows:

$$
\begin{array}{c|ccc|c}
 & 1 & 2 & 3 & \text{Row mean} \\
\hline
1 & X_{111},\ X_{112},\ X_{113} & X_{121},\ X_{122},\ X_{123} & X_{131},\ X_{132},\ X_{133} & \bar{X}_{1..} \\
2 & X_{211},\ X_{212},\ X_{213} & X_{221},\ X_{222},\ X_{223} & X_{231},\ X_{232},\ X_{233} & \bar{X}_{2..} \\
\hline
\text{Column mean} & \bar{X}_{.1.} & \bar{X}_{.2.} & \bar{X}_{.3.} & \bar{X}_{...}
\end{array}
$$

Triple subscripts are used. The first subscript identifies the row, the
second the column, and the third the measurement within the cell. Thus
$X_{231}$ means the measurement for the first individual in the second row
and the third column. In general $X_{rci}$ denotes the measurement for the
$i$th individual in the $r$th row and $c$th column, where
$i = 1, 2, \ldots, n$. Row, column, and cell means are identified by a dot
notation. The mean of all the observations in the first row is
$\bar{X}_{1..}$. The mean of all the observations in the $r$th row is
$\bar{X}_{r..}$. Similarly, the mean of the first column is $\bar{X}_{.1.}$
and of the $c$th column $\bar{X}_{.c.}$. The mean of all the observations in
the cell corresponding to the $r$th row and $c$th column is $\bar{X}_{rc.}$.
The mean of all $nRC$ observations, the grand mean, is $\bar{X}_{...}$. The
total sum of squares of all observations about the grand mean is

$$
\sum_{r=1}^{R} \sum_{c=1}^{C} \sum_{i=1}^{n} (X_{rci} - \bar{X}_{...})^2
$$

The sum of squares of deviations about the grand mean, both where $n = 1$
and $n > 1$, is partitioned into additive components.


16.3 PARTITIONING THE SUM OF SQUARES

With one measurement only for each of the $RC$ treatment combinations,
the total sum of squares may be partitioned into three additive
components, a between-rows, a between-columns, and an interaction sum of
squares. We proceed by writing the identity

$$
(X_{rc} - \bar{X}_{..}) = (\bar{X}_{r.} - \bar{X}_{..}) + (\bar{X}_{.c} - \bar{X}_{..}) + (X_{rc} - \bar{X}_{r.} - \bar{X}_{.c} + \bar{X}_{..})
$$

This identity states that the deviation of an observation from the grand
mean may be viewed as composed of three parts, a deviation of the row
mean from the grand mean, a deviation of the column mean from the
grand mean, and a remainder, or residual term, known as an interaction
term. By squaring both sides of the above identity, an expression is
obtained containing six terms. This may be summed over $R$ rows and $C$
columns. Three of these terms conveniently vanish, because they contain
a sum of deviations about a mean, which, of course, is zero. The resulting
total sum of squares may be written as

$$
\sum_{r=1}^{R} \sum_{c=1}^{C} (X_{rc} - \bar{X}_{..})^2
  = C \sum_{r=1}^{R} (\bar{X}_{r.} - \bar{X}_{..})^2
  + R \sum_{c=1}^{C} (\bar{X}_{.c} - \bar{X}_{..})^2
  + \sum_{r=1}^{R} \sum_{c=1}^{C} (X_{rc} - \bar{X}_{r.} - \bar{X}_{.c} + \bar{X}_{..})^2
\qquad [16.1]
$$

The first term to the right is $C$ times the sum of squares of deviations of
row means from the grand mean. This is the between-rows sum of squares.
It describes the variation in row means. The second term is $R$ times the
sum of squares of deviations of column means from the grand mean. This
is the between-columns sum of squares. It describes variation in column
means. The third term is a residual, or interaction, sum of squares. The
meaning of the interaction term is discussed in detail in Sec. 16.5.

With n measurements for each of the RC treatment combinations the 
total sum of squares may be partitioned into four additive components. 
These are a between-rows, a between-columns, an interaction, and a 
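within-cells sum of squares. In this situation we write the identity

$$(X_{rci} - \bar{X}_{...}) = (\bar{X}_{r..} - \bar{X}_{...}) + (\bar{X}_{.c.} - \bar{X}_{...}) + (\bar{X}_{rc.} - \bar{X}_{r..} - \bar{X}_{.c.} + \bar{X}_{...}) + (X_{rci} - \bar{X}_{rc.})$$

This expression may be squared and summed over rows, columns, and
within cells. All but four terms vanish, and the resulting total sum of
squares may be written as

$$\begin{aligned}
\sum_{r=1}^{R} \sum_{c=1}^{C} \sum_{i=1}^{n} (X_{rci} - \bar{X}_{...})^2
&= nC \sum_{r=1}^{R} (\bar{X}_{r..} - \bar{X}_{...})^2 + nR \sum_{c=1}^{C} (\bar{X}_{.c.} - \bar{X}_{...})^2 \\
&\quad + n \sum_{r=1}^{R} \sum_{c=1}^{C} (\bar{X}_{rc.} - \bar{X}_{r..} - \bar{X}_{.c.} + \bar{X}_{...})^2 + \sum_{r=1}^{R} \sum_{c=1}^{C} \sum_{i=1}^{n} (X_{rci} - \bar{X}_{rc.})^2
\end{aligned}$$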
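The additivity of the four components can be checked numerically. The following is a minimal sketch in Python, not part of the original text; the scores and every variable name are hypothetical and chosen only for illustration. It computes each component from its definition and compares the sum of the four with the total sum of squares.

```python
# Hypothetical data: scores[r][c] holds the n measurements for row r, column c.
scores = [
    [[3, 5], [6, 8], [4, 6]],
    [[7, 9], [5, 7], [9, 11]],
]
R, C, n = len(scores), len(scores[0]), len(scores[0][0])


def mean(values):
    return sum(values) / len(values)


# Grand, row, column, and cell means (the dot-notation means of the text).
grand = mean([x for row in scores for cell in row for x in cell])
row_means = [mean([x for cell in row for x in cell]) for row in scores]
col_means = [mean([x for row in scores for x in row[c]]) for c in range(C)]
cell_means = [[mean(cell) for cell in row] for row in scores]

# Each component follows its definition in the partitioned total.
ss_total = sum((x - grand) ** 2 for row in scores for cell in row for x in cell)
ss_rows = n * C * sum((m - grand) ** 2 for m in row_means)
ss_cols = n * R * sum((m - grand) ** 2 for m in col_means)
ss_inter = n * sum(
    (cell_means[r][c] - row_means[r] - col_means[c] + grand) ** 2
    for r in range(R)
    for c in range(C)
)
ss_within = sum(
    (x - cell_means[r][c]) ** 2
    for r in range(R)
    for c in range(C)
    for x in scores[r][c]
)

# The two printed values agree apart from floating-point rounding.
print(ss_total, ss_rows + ss_cols + ss_inter + ss_within)
```

Because the cross-product terms vanish, the agreement holds for any data with equal numbers in the cells.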


16.4 VARIANCE ESTIMATES OR MEAN SQUARES


With a single entry in each cell, n = 1 and RC = N. The number of degrees
of freedom associated with the total sum of squares is RC − 1 = N − 1. The
numbers of degrees of freedom associated with the row and column sums of
squares are R − 1 and C − 1, respectively. The number of degrees of freedom
associated with the interaction sum of squares is (R − 1)(C − 1). The degrees of
freedom are additive, and

$$\underbrace{RC - 1}_{\text{total}} = \underbrace{(R - 1)}_{\text{row}} + \underbrace{(C - 1)}_{\text{column}} + \underbrace{(R - 1)(C - 1)}_{\text{interaction}}$$

The sums of squares are divided by the associated degrees of freedom to
obtain three variance estimates, or mean squares. The between-rows,
between-columns, and interaction variance estimates are $s_r^2$, $s_c^2$, and $s_{rc}^2$,
respectively.
With n entries in each cell, where n > 1, the total number of observations
is nRC = N. The number of degrees of freedom associated with the
total sum of squares is nRC − 1 = N − 1. The numbers of degrees of freedom
associated with row, column, and interaction sums of squares are
R − 1, C − 1, and (R − 1)(C − 1), respectively. The number of degrees of
freedom associated with the within-cells sum of squares is nRC − RC =
RC(n − 1). Because the deviations are taken about the cell means, one
degree of freedom is lost for each cell. In each cell n − 1 deviations are
free to vary. The number of degrees of freedom for RC cells, therefore, is
RC(n − 1). The degrees of freedom are additive. The sums of squares
are divided by the associated degrees of freedom to obtain the variance
estimates, or mean squares.

Table 16.1 shows in summary form the sum of squares, degrees of
freedom, and variance estimates for a two-way classification with n entries
per cell.


Table 16.1  Analysis of variance for two-way classification with n entries per cell: n > 1

Source          Sum of squares                                                                                            df                 Variance estimate
Rows            $nC \sum_{r=1}^{R} (\bar{X}_{r..} - \bar{X}_{...})^2$                                                     $R - 1$            $s_r^2$
Columns         $nR \sum_{c=1}^{C} (\bar{X}_{.c.} - \bar{X}_{...})^2$                                                     $C - 1$            $s_c^2$
Interaction     $n \sum_{r=1}^{R} \sum_{c=1}^{C} (\bar{X}_{rc.} - \bar{X}_{r..} - \bar{X}_{.c.} + \bar{X}_{...})^2$       $(R - 1)(C - 1)$   $s_{rc}^2$
Within cells    $\sum_{r=1}^{R} \sum_{c=1}^{C} \sum_{i=1}^{n} (X_{rci} - \bar{X}_{rc.})^2$                                $RC(n - 1)$        $s_w^2$
Total           $\sum_{r=1}^{R} \sum_{c=1}^{C} \sum_{i=1}^{n} (X_{rci} - \bar{X}_{...})^2$                                $nRC - 1$

F ratios are formed from the variance estimates and used to test the
significance of row, column, and, where n > 1, interaction effects. The
correct procedure here, and the interpretation of the variance estimates,
depends on the statistical model appropriate for the experiment. Three
models may be identified: fixed, random, and mixed. The investigator
must decide which model fits his experiment. This decision determines
how the variance estimates are used in the application of tests of significance
to the data. Before proceeding with a discussion of these models
(Sec. 16.6), the meaning of the interaction term is discussed.


16.5 THE NATURE OF INTERACTION


The algebraic partitioning of sums of squares in a two-way classification,
where n > 1, leads to the interaction term

$$n \sum_{r=1}^{R} \sum_{c=1}^{C} (\bar{X}_{rc.} - \bar{X}_{r..} - \bar{X}_{.c.} + \bar{X}_{...})^2$$

The nature of interaction may be illustrated by example. Consider a 
simple agricultural experiment with two varieties of wheat and two types 
of fertilizer. Assume that one variety of wheat has a higher yield than 
the other. If the yield is uniformly higher regardless of which fertilizer is
used, then there is no interaction between the two experimental variables.
If, however, one variety produces a relatively higher yield with one type 
of fertilizer than with the other, then the two variables may be said to 
interact. To illustrate further, assume that we have two methods of teach- 
ing arithmetic and two teachers. Each teacher uses the two methods on 
separate groups of pupils. The achievement of the pupils is measured. If
one method of instruction is uniformly superior or inferior regardless of 
which teacher uses it, then there is no interaction between methods and 
teacher. If, however, one teacher obtains better results with one method 
than the other, and the opposite holds for the other teacher, then teachers 
and methods may be said to interact. 

Table 16.2 shows observed cell means for a two-way classification with 
three categories for each of the experimental variables. The observed 
cell entries are means based on an equal number of cases. What are the 
expected cell means on the assumption of zero interaction? This situation 
is somewhat analogous to the calculation of expected values for contin- 
gency tables. For a contingency table we calculate expected cell frequen- 
cies. Here we are required to calculate the expected cell means on the 
assumption that the two experimental variables function independently. 

Assuming zero interaction, certain constant differences will be main- 
tained between cell means. In Table 16.2 the observed row mean for A 
is 10 points less than the row mean for B. If the interaction were zero, we 
should expect a constant 10-point difference to occur between means for 
A and B under treatments I, II, and III. A similar relationship would be 
expected on comparing all other rows and columns of this table. Obviously, 
the observed values in Table 16.2 do not exhibit this characteristic. The 
interaction is not zero. 

Table 16.2  Comparison of observed cell means and means expected under zero interaction

              Observed, $\bar{X}_{rc.}$                       Expected, $E(\bar{X}_{rc.})$
          I      II     III    Mean               I      II     III    Mean
A         ...    ...    ...    10          A      4      16     10     10
B         ...    ...    ...    20          B      14     26     20     20
C         ...    ...    ...    60          C      54     66     60     60
Mean      ...    ...    ...    30          Mean   24     36     30     30

Where the interaction is zero, a deviation of a cell mean from the mean
of the row (or column) to which it belongs will be equal to the deviation of
its column (or row) mean from the grand mean. If $\bar{X}_{rc.}$ is a cell mean and
$\bar{X}_{r..}$ and $\bar{X}_{.c.}$ are its corresponding row and column means, then under zero
interaction, $\bar{X}_{rc.} - \bar{X}_{r..} = \bar{X}_{.c.} - \bar{X}_{...}$. Thus the expected value of $\bar{X}_{rc.}$ under
zero interaction is given by $E(\bar{X}_{rc.}) = \bar{X}_{r..} + \bar{X}_{.c.} - \bar{X}_{...}$. These expected
values have been calculated for the observed data of Table 16.2 and are
shown to the right of the table. On comparing the expected values in any
two rows or columns, note the constant increment or decrement. If $\bar{X}_{rc.}$ is
an observed and $E(\bar{X}_{rc.})$ an expected value, the deviation of an observed
from an expected value is $\bar{X}_{rc.} - \bar{X}_{r..} - \bar{X}_{.c.} + \bar{X}_{...}$. The interaction term in the
analysis of variance is n times the sum of squares of deviations of the
observed cell means from the expected cell means.
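The calculation of expected cell means may be sketched in a few lines of Python. This is an illustration only: the dictionary names are hypothetical, and the marginal means are those of Table 16.2 (row means 10, 20, and 60; column means 24, 36, and 30; grand mean 30).

```python
# Marginal means of Table 16.2; the variable names are illustrative.
row_means = {"A": 10, "B": 20, "C": 60}
col_means = {"I": 24, "II": 36, "III": 30}
grand_mean = 30

# Under zero interaction: expected cell mean = row mean + column mean - grand mean.
expected = {
    r: {c: rm + cm - grand_mean for c, cm in col_means.items()}
    for r, rm in row_means.items()
}

for r, cells in expected.items():
    print(r, cells)
```

Any two rows of the result differ by a constant, which is the defining property of zero interaction described above.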


16.6 FINITE, RANDOM, FIXED, AND MIXED MODELS


Different authors recommend different procedures for testing row, column, 
and interaction effects in a two-way analysis of variance. Difficulties 
associated with the selection of the appropriate procedure are resolved 
by the recognition of a general statistical model underlying the analysis of 
variance. This model is referred to here as the finite model. Three particu- 
lar cases of the finite model may be identified. These are the random, fixed, 
and mixed models. The models appropriate for different experiments 
differ. The investigator must decide which model best represents his 
experiment. The choice of model determines the procedure for testing 
row, column, and interaction effects. The choice of model depends on the 
nature of the variables used as the basis of classification in the experi- 
mental design. 

The general finite model makes the linearity assumption that a deviation
of an observation $X_{rci}$ from the population value of the grand mean $\mu$ may
be expressed in the form

$$X_{rci} - \mu = a_r + b_c + (ab)_{rc} + e_{rci}$$

The four quantities to the right are in deviation form. Thus $a_r = \mu_{r..} - \mu$, a
deviation of the population value of the row mean from the grand mean $\mu$.
Similarly, $b_c = \mu_{.c.} - \mu$, a deviation of a column mean from the grand mean.
The interaction term $(ab)_{rc} = (\mu_{rc.} - \mu_{r..} - \mu_{.c.} + \mu)$, and the error term
$e_{rci} = X_{rci} - \mu_{rc.}$. Where this model is used to represent experimental data
the implicit assumption is made that treatment effects can meaningfully
be partitioned into additive components for each sampling unit. Because
$a_r$, $b_c$, $(ab)_{rc}$, and $e_{rci}$ are in deviation form, they sum to zero. The population
variances of the four components are $\sigma_a^2$, $\sigma_b^2$, $\sigma_{ab}^2$, and $\sigma_e^2$.

The null hypothesis under test, for example, for row effects is
$H_0: \mu_{1..} = \mu_{2..} = \cdots = \mu_{R..}$. This hypothesis may be stated in the form $H_0: \sigma_a^2 = 0$.
Similarly, the null hypotheses for column and interaction effects may be
stated as $H_0: \sigma_b^2 = 0$ and $H_0: \sigma_{ab}^2 = 0$. We wish to obtain from the experimental
data information which will provide a valid test of these hypotheses.
We now consider an actual experiment involving R levels of one variable 
and C levels of another. The R and C levels may be regarded as samples 
drawn at random from two populations of levels comprised of $R_p$ and $C_p$
members, respectively. Thus we conceptualize two populations of levels.
The levels used in a particular experiment are construed to be drawn at
random from these two populations. $R_p$, $R$, $C_p$, and $C$ may take any integral
values, provided, of course, that $R \le R_p$ and $C \le C_p$. The RC treatment
combinations are assigned at random to the nRC sampling units or indivi- 
duals. Under these conditions, and given the basic linearity assumption, 
Wilk and Kempthorne (1955) have shown that the expectations of the mean 
squares for the general finite model are as shown in Table 16.3. 


Table 16.3  Expectation of mean squares for general finite model: two-way
analysis of variance with n entries in each cell: n > 1

Mean square                   Expectation of mean square
Row, $s_r^2$                  $\sigma_e^2 + \frac{C_p - C}{C_p}\, n\sigma_{ab}^2 + nC\sigma_a^2$
Column, $s_c^2$               $\sigma_e^2 + \frac{R_p - R}{R_p}\, n\sigma_{ab}^2 + nR\sigma_b^2$
Interaction, $s_{rc}^2$       $\sigma_e^2 + n\sigma_{ab}^2$
Within cells, $s_w^2$         $\sigma_e^2$


Thus the mean squares provide estimates of variance components, 
and these are used to test the significance of row, column, and interaction 
effects. How these are used depends on a consideration of three particular 
instances of the general finite model.

Consider an experiment involving R levels of one variable and C of
another, these being regarded as random samples of levels from populations
comprised of $R_p$ and $C_p$ members. We may consider a case where
$R_p$ and $C_p$ are very large, so that $R_p \gg R$ and $C_p \gg C$, where $\gg$ denotes
much greater than. Under these circumstances such terms as $(R_p - R)/R_p$
and $(C_p - C)/C_p$ approach unity. When this is so we have what is referred
to as a random-model situation. The expectations for the random model are
obtained by substituting $(R_p - R)/R_p = 1$ and $(C_p - C)/C_p = 1$ in the expectations
of the mean squares for the general finite model given in Table 16.3.
Thus the random model is a particular case of the finite model.
In psychological research, experiments where the random model is 
appropriate are not numerous. Satisfactory examples are not readily
found. One example is an experiment where each member of a sample of
R job applicants is assigned a rating by each member of a sample of C interviewers.
Here both job applicants and interviewers may be viewed as
samples drawn at random from populations such that $R_p \gg R$ and
$C_p \gg C$.
In many experiments the R levels of one variable and the C levels of the
other are not conceptualized as random samples. In agricultural experiments
where R varieties of wheat and C varieties of fertilizer are used, the
investigator is usually concerned with the yield of particular wheat varieties
and with the effect of particular fertilizers on yield. He is not concerned
with drawing inferences about hypothetical populations of wheat and
fertilizer varieties. Both variables or factors are fixed. Any factor is fixed
if the investigator on repeating the experiment would use the same levels
of it. Under the fixed model $R = R_p$ and $C = C_p$. By substituting
$(R_p - R)/R_p = 0$ and $(C_p - C)/C_p = 0$ in the expectations of the mean squares for the
finite model given in Table 16.3, the expectations for the fixed model are
obtained.
In psychological experiments different methods of learning, environmental
conditions, methods of inducing stress, and the like, are examples
of fixed factors or variables. In many experiments different levels of the
experimental variable are introduced, e.g., levels of illumination, time
intervals, size of brain lesion, and dosages of a drug. While the levels may
be thought to constitute a representative set and interpolation between
levels may be possible, such variables are usually regarded as fixed. Of
course it is possible to conceptualize a study where, for example, levels of
illumination or dosages of a drug are sampled at random from a population
of levels or dosages. Ordinarily, however, experiments are not designed
in this way.

In many experiments one basis of classification is a random factor or
variable and the other is fixed. Measurements may be obtained for a
sample of R individuals for each of C treatments or experimental conditions.
Here one basis of classification is random and the other is fixed.
This is a mixed model. In the mixed-model situation either $R_p = R$ and
$C_p \gg C$ or $R_p \gg R$ and $C_p = C$. By substituting $(R_p - R)/R_p = 1$ and
$(C_p - C)/C_p = 0$, or vice versa, in the expectations for the finite model of
Table 16.3, we obtain the expectations for the mixed model.




Table 16.3 may be used to provide the required expectations for a two- 
way classification where n = 1. Under this circumstance no within-cells 
variance estimate $s_w^2$ is available. The expectations for row, column, and
interaction effects for the random, fixed, and mixed models are obtained
by writing n = 1 and substituting the appropriate values of $(R_p - R)/R_p$
and $(C_p - C)/C_p$.
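The substitutions described in this section can be sketched as a small Python function. This is an illustration, not part of the original treatment: the function name, its arguments, and the convention that `None` stands for an indefinitely large population of levels (so that the fraction is taken as 1) are all assumptions made for this example. The expressions themselves are those of Table 16.3.

```python
def expected_mean_squares(var_e, var_ab, var_a, var_b, R, C, n, Rp=None, Cp=None):
    """Expectations of Table 16.3 for given variance components.

    Rp and Cp are the numbers of levels in the two populations of levels.
    None is used here (an assumption of this sketch) for a very large
    population, for which (Rp - R)/Rp or (Cp - C)/Cp is taken as 1.
    """
    frac_r = 1.0 if Rp is None else (Rp - R) / Rp
    frac_c = 1.0 if Cp is None else (Cp - C) / Cp
    return {
        "row": var_e + frac_c * n * var_ab + n * C * var_a,
        "column": var_e + frac_r * n * var_ab + n * R * var_b,
        "interaction": var_e + n * var_ab,
        "within": var_e,
    }


# Hypothetical variance components, used only to show the three special cases.
components = dict(var_e=1.0, var_ab=2.0, var_a=3.0, var_b=4.0, R=2, C=3, n=5)
random_model = expected_mean_squares(**components)              # Rp, Cp very large
fixed_model = expected_mean_squares(**components, Rp=2, Cp=3)   # Rp = R, Cp = C
mixed_model = expected_mean_squares(**components, Cp=3)         # rows random, columns fixed
print(random_model["row"], fixed_model["row"], mixed_model["row"])
```

For n = 1 the same function applies with `n=1`, with the understanding that no within-cells estimate is then available from the data.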


16.7 CHOICE OF ERROR TERM 


By choice of error term is meant the selection of the appropriate variance 
estimate for the denominator of the F ratio in testing row, column, and 
interaction effects. In general, in forming an F ratio, the expectation of 
the variance estimate in the numerator should contain one term more than 
the expectation of the variance estimate in the denominator, the additional 
term involving the effect under test. On applying this principle to the 
expectations of Table 16.3, the following rules may be formulated: 


1 Random model: n > 1  The proper error term for testing the interaction effect
is $s_w^2$. $F_{rc} = s_{rc}^2/s_w^2$. The correct error term for testing row and column
effects is $s_{rc}^2$. $F_r = s_r^2/s_{rc}^2$ and $F_c = s_c^2/s_{rc}^2$.


2 Fixed model: n > 1  The proper error term is $s_w^2$ for interaction, row, and
column effects. The three F ratios are $F_{rc} = s_{rc}^2/s_w^2$, $F_r = s_r^2/s_w^2$, and
$F_c = s_c^2/s_w^2$.


3 Mixed model: n > 1  The proper error term for testing the interaction effect
is $s_w^2$. $F_{rc} = s_{rc}^2/s_w^2$. When R is random and C is fixed, the proper error term
for testing row effects is $s_w^2$. $F_r = s_r^2/s_w^2$. The proper term for testing column
effects is $s_{rc}^2$. $F_c = s_c^2/s_{rc}^2$. When R is fixed and C is random, the converse
procedure applies. $F_r = s_r^2/s_{rc}^2$ and $F_c = s_c^2/s_w^2$.


4 Random model: n = 1  No $s_w^2$ is available. The correct error term for testing
both row and column effects is $s_{rc}^2$. $F_r = s_r^2/s_{rc}^2$ and $F_c = s_c^2/s_{rc}^2$.


5 Fixed model: n = 1  The point of view may be adopted that no test of either
row or column effects can be made. This point of view requires some modification.
The ratio $s_r^2/s_{rc}^2$ is an estimate of $(\sigma_e^2 + C\sigma_a^2)/(\sigma_e^2 + \sigma_{ab}^2)$ and
will, where $\sigma_{ab}^2 > 0$, be an underestimate of $(\sigma_e^2 + C\sigma_a^2)/\sigma_e^2$. This means
that if a significant result is obtained, the investigator knows a fortiori that
the effect tested is significant. If the result is not significant, the probability
of accepting the null hypothesis, $H_0: \sigma_a^2 = 0$, when it is false, may be high.
Thus in the absence of significance no conclusions should be drawn from
the data.


6 Mixed model: n = 1  When R is random and C is fixed, the situation pertaining
to the testing of row effects is as described above for the fixed
model, n = 1. The proper error term for the column effect is $s_{rc}^2$. $F_c = s_c^2/s_{rc}^2$.
When C is random and R is fixed, the argument relating to the fixed model,
n = 1, again applies. The proper error term for the row effect is $s_{rc}^2$.
$F_r = s_r^2/s_{rc}^2$.


The above rules, excluding the modifications of rules 5 and 6 above, 
can be very simply obtained by using the following schema for the proper 
choice of error term: 


Row            $s_r^2$         $\frac{C_p - C}{C_p}\, s_{rc}^2$      $\frac{C}{C_p}\, s_w^2$
Column         $s_c^2$         $\frac{R_p - R}{R_p}\, s_{rc}^2$      $\frac{R}{R_p}\, s_w^2$
Interaction    $s_{rc}^2$                                            $s_w^2$


For the random model, $(C_p - C)/C_p = 1$ and $C/C_p = 0$. The proper error
term for row and column effects is $s_{rc}^2$. Similarly, the proper error term for
the fixed and mixed models may be obtained. When n = 1, all terms containing
$s_w^2$ vanish. For the random model, $s_{rc}^2$ becomes the correct error
term for row and column effects. For the fixed model, no tests are possible.
When rows are random and the columns are fixed, the column effect may
be tested, but not the row.
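The rules of this section can be gathered into one short Python function. This is a sketch for illustration; the function name, its arguments, and the string labels it returns are hypothetical. It returns `None` where no exact test is available, which leaves aside the qualified tests discussed under rules 5 and 6.

```python
def error_term(effect, rows_random, cols_random, n):
    """Return the error term for 'row', 'column', or 'interaction'.

    The result is 's_w^2', 's_rc^2', or None when no exact test exists.
    rows_random and cols_random state whether each factor is random.
    """
    if effect == "interaction":
        # Interaction is always tested against within cells, which needs n > 1.
        return "s_w^2" if n > 1 else None
    # For a row test the deciding factor is the column factor, and vice versa.
    other_random = cols_random if effect == "row" else rows_random
    if other_random:
        return "s_rc^2"
    return "s_w^2" if n > 1 else None


# Mixed model with rows random and columns fixed, n > 1.
print(error_term("row", True, False, 8))     # s_w^2
print(error_term("column", True, False, 8))  # s_rc^2
```

The design choice follows the expectations of Table 16.3: the interaction component enters the expectation of a row mean square only when the column factor is random, and conversely.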


16.8 POOLING SUMS OF SQUARES: n > 1

Under certain circumstances the within-cells and interaction sums of
squares may be added together and divided by the combined degrees of
freedom to obtain an estimate of variance based on a larger number
of degrees of freedom. Caution should be exercised in applying this
procedure.

For the fixed model, the within-cells variance estimate is the proper
error term for testing interaction, row, and column effects. For the random
model, the interaction variance estimate is the proper error term for
testing row and column effects. These procedures are always correct.
For both models, when the interaction is quite clearly not significant, the 
within-cells and interaction sums of squares may be pooled to obtain a 
variance estimate for the denominator of the F ratio based on a larger 
number of degrees of freedom. Of course, when row and column effects 
are clearly significant, when tested without pooling, the pooling procedure 
is unnecessary. 
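The pooling calculation itself is simple and may be sketched as follows. The function name is hypothetical; the figures are the interaction and within-cells sums of squares and degrees of freedom of Table 16.6 in the illustrative example of this chapter, where the interaction is clearly not significant.

```python
def pooled_error(ss_interaction, df_interaction, ss_within, df_within):
    """Pool the interaction and within-cells sums of squares.

    Returns the pooled variance estimate and its degrees of freedom.
    """
    pooled_df = df_interaction + df_within
    return (ss_interaction + ss_within) / pooled_df, pooled_df


# Values from Table 16.6: interaction 1,332.04 on 2 df, within cells 42,667.38 on 42 df.
pooled_ms, pooled_df = pooled_error(1332.04, 2, 42667.38, 42)
print(round(pooled_ms, 2), pooled_df)
```

The pooled estimate rests on 44 rather than 42 degrees of freedom, which is the only gain the procedure offers.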

When doubt exists as to the significance of the interaction, the investigator
may or may not choose to pool the sums of squares. If the interaction
effect in fact exists, $\sigma_{ab}^2$ being greater than zero, and terms are pooled,
the pooling may be said to be erroneous.

For the fixed model, erroneous pooling will increase the size of the error
term. For the random model, erroneous pooling will decrease the size of
the error term. In both instances the number of degrees of freedom is
increased. Erroneous pooling will for the fixed model usually lead to too
few significant results and for the random model to too many significant
results.

For the mixed model, when rows are random and columns are fixed,
pooling may be applied with nonsignificant interaction. In this situation
erroneous pooling will tend to make the error term too large for testing
row effects and too small for testing column effects, leading to too few
significant effects for rows and too many for columns.

An understanding of the consequences of pooling sums of squares for
fixed, mixed, and random models, when interaction does exist, that is,
when $\sigma_{ab}^2 > 0$, may be obtained by examination of the expectations of
the variance estimates given in Table 16.3. Quite clearly, for the fixed
model, when $\sigma_{ab}^2 > 0$, combining interaction and within cells will lead to
an error term whose expectation is greater than $\sigma_e^2$. Consequently, too
few significant results will be obtained.

In general, it is probably advisable not to pool unless the investigator
is quite confident that the interaction is not significant. For a detailed
discussion of this rather troublesome problem, see Binder (1955).

16.9 COMPUTATION FORMULAS FOR SUMS OF SQUARES


Computation formulas are used to calculate the required sums of squares.
A simplified notation is used. Denote the sum of all observations in the
rth row by $T_{r.}$, the sum of all observations in the cth column by $T_{.c}$, the
sum of all observations in the cell corresponding to the rth row and cth
column by $T_{rc}$, and the sum of all N observations by $T$.

With one entry in each cell, the computation formulas for sums of
squares are as follows:

[16.3]   Rows           $\frac{1}{C} \sum_{r=1}^{R} T_{r.}^2 - \frac{T^2}{N}$

[16.4]   Columns        $\frac{1}{R} \sum_{c=1}^{C} T_{.c}^2 - \frac{T^2}{N}$

[16.5]   Interaction    $\sum_{r=1}^{R} \sum_{c=1}^{C} X_{rc}^2 - \frac{1}{C} \sum_{r=1}^{R} T_{r.}^2 - \frac{1}{R} \sum_{c=1}^{C} T_{.c}^2 + \frac{T^2}{N}$

[16.6]   Total          $\sum_{r=1}^{R} \sum_{c=1}^{C} X_{rc}^2 - \frac{T^2}{N}$


The interaction sum of squares may be obtained by adding the row and
column sums of squares and subtracting this from the total sum of squares. This
provides no check on the accuracy of the calculation; consequently it is
preferable to compute the interaction term directly.

Computation formulas for sums of squares with n entries in each cell
are as follows:

[16.7]    Rows            $\frac{1}{nC} \sum_{r=1}^{R} T_{r.}^2 - \frac{T^2}{N}$

[16.8]    Columns         $\frac{1}{nR} \sum_{c=1}^{C} T_{.c}^2 - \frac{T^2}{N}$

[16.9]    Interaction     $\frac{1}{n} \sum_{r=1}^{R} \sum_{c=1}^{C} T_{rc}^2 - \frac{1}{nC} \sum_{r=1}^{R} T_{r.}^2 - \frac{1}{nR} \sum_{c=1}^{C} T_{.c}^2 + \frac{T^2}{N}$

[16.10]   Within cells    $\sum_{r=1}^{R} \sum_{c=1}^{C} \sum_{i=1}^{n} X_{rci}^2 - \frac{1}{n} \sum_{r=1}^{R} \sum_{c=1}^{C} T_{rc}^2$

[16.11]   Total           $\sum_{r=1}^{R} \sum_{c=1}^{C} \sum_{i=1}^{n} X_{rci}^2 - \frac{T^2}{N}$

Here again the interaction sum of squares may be obtained by subtracting
the row, column, and within-cells sums of squares from the total,
although direct calculation of the interaction term is preferable.

The reader should note that the analysis of variance for two-way classification
with a single entry in each cell is a particular case of the more
general case with more than one entry in each cell. When n = 1, formulas
for the latter case become the formulas for the former.
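The computation formulas for n entries per cell translate directly into code. The following Python sketch is illustrative only; the function name and the small data set are hypothetical. It works from row, column, and cell totals exactly as the formulas for rows, columns, interaction, within cells, and total prescribe.

```python
def two_way_sums_of_squares(data):
    """Sums of squares from totals; data[r][c] lists the n observations of a cell."""
    R, C, n = len(data), len(data[0]), len(data[0][0])
    N = n * R * C
    cell_totals = [[sum(cell) for cell in row] for row in data]
    row_totals = [sum(row) for row in cell_totals]
    col_totals = [sum(cell_totals[r][c] for r in range(R)) for c in range(C)]
    correction = sum(row_totals) ** 2 / N  # T^2 / N
    sum_x2 = sum(x ** 2 for row in data for cell in row for x in cell)
    cells_term = sum(t ** 2 for row in cell_totals for t in row) / n
    rows_term = sum(t ** 2 for t in row_totals) / (n * C)
    cols_term = sum(t ** 2 for t in col_totals) / (n * R)
    return {
        "rows": rows_term - correction,
        "columns": cols_term - correction,
        "interaction": cells_term - rows_term - cols_term + correction,
        "within": sum_x2 - cells_term,
        "total": sum_x2 - correction,
    }


# Hypothetical 2 x 2 table with two observations per cell.
ss = two_way_sums_of_squares([[[1, 3], [2, 4]], [[5, 7], [8, 10]]])
print(ss)
```

Computing the interaction directly, as here, allows the four components to be summed and compared with the total as a check on the arithmetic.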


16.10 ILLUSTRATIVE EXAMPLE OF TWO-WAY
CLASSIFICATION: n > 1


Table 16.4 shows data obtained in an animal experiment designed to
study the effects of two variables on measures of performance of rats in
a maze test. Three strains of rats were used, bright, mixed, and dull. A
group from each strain was reared under free and restricted environmental
conditions. Thus there are six groups of experimental animals with eight
animals in each group. The total N is 48. The data are arranged in a 2 × 3
table with eight observations in each of the six cells. The row means permit
a comparison of environments, and the column means a comparison of
strains. Table 16.5 shows the sums, means, and sum of squares of row,
column, and cell totals. The sum of squares for all the observations is also
given.

Table 16.4  Data for the analysis of variance with two-way
classification: n > 1. Error scores for three strains
of rats reared under two environmental conditions

                          Strain
Environment     Bright        Mixed         Dull
Free            26   14       41   82       36   87
                41   16       26   86       39   99
                28   29       19   45       59  126
                92   31       59   37       27  104
Restricted      51   35       39  114       42  133
                96   36      104   92       92  124
                97   28      130   87      156   68
                22   76      122   64      144  142


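Applying the computation formulas, the calculations are as follows:

Rows

$$\frac{1}{nC} \sum_{r=1}^{R} T_{r.}^2 - \frac{T^2}{N} = \frac{5{,}944{,}837}{24} - \frac{(3{,}343)^2}{48} = 14{,}875.52$$

Columns

$$\frac{1}{nR} \sum_{c=1}^{C} T_{.c}^2 - \frac{T^2}{N} = \frac{4{,}015{,}617}{16} - \frac{(3{,}343)^2}{48} = 18{,}150.04$$

Within cells

$$\sum_{r=1}^{R} \sum_{c=1}^{C} \sum_{i=1}^{n} X_{rci}^2 - \frac{1}{n} \sum_{r=1}^{R} \sum_{c=1}^{C} T_{rc}^2 = 309{,}851 - \frac{2{,}137{,}469}{8} = 42{,}667.38$$

Interaction

$$\frac{1}{n} \sum_{r=1}^{R} \sum_{c=1}^{C} T_{rc}^2 - \frac{1}{nC} \sum_{r=1}^{R} T_{r.}^2 - \frac{1}{nR} \sum_{c=1}^{C} T_{.c}^2 + \frac{T^2}{N} = \frac{2{,}137{,}469}{8} - \frac{5{,}944{,}837}{24} - \frac{4{,}015{,}617}{16} + \frac{(3{,}343)^2}{48} = 1{,}332.04$$

Total

$$\sum_{r=1}^{R} \sum_{c=1}^{C} \sum_{i=1}^{n} X_{rci}^2 - \frac{T^2}{N} = 309{,}851 - \frac{(3{,}343)^2}{48} = 77{,}024.98$$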


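Table 16.5  Computation for data of Table 16.4

                                         Strain
Environment    Bright                     Mixed                      Dull                        Total
Free           $T_{11} = 277$             $T_{12} = 395$             $T_{13} = 577$              $T_{1.} = 1,249$
               $\bar{X}_{11.} = 34.63$    $\bar{X}_{12.} = 49.38$    $\bar{X}_{13.} = 72.13$     $\bar{X}_{1..} = 52.04$
Restricted     $T_{21} = 441$             $T_{22} = 752$             $T_{23} = 901$              $T_{2.} = 2,094$
               $\bar{X}_{21.} = 55.13$    $\bar{X}_{22.} = 94.00$    $\bar{X}_{23.} = 112.63$    $\bar{X}_{2..} = 87.25$
Total          $T_{.1} = 718$             $T_{.2} = 1,147$           $T_{.3} = 1,478$            $T = 3,343$
               $\bar{X}_{.1.} = 44.88$    $\bar{X}_{.2.} = 71.69$    $\bar{X}_{.3.} = 92.38$     $\bar{X}_{...} = 69.65$

$\sum_{r=1}^{R} T_{r.}^2 = 5,944,837$          $\sum_{c=1}^{C} T_{.c}^2 = 4,015,617$
$\sum_{r=1}^{R} \sum_{c=1}^{C} T_{rc}^2 = 2,137,469$          $\sum_{r=1}^{R} \sum_{c=1}^{C} \sum_{i=1}^{n} X_{rci}^2 = 309,851$

The analysis-of-variance table for these data is shown in Table 16.6.
The df for rows is R − 1 = 2 − 1 = 1, for columns C − 1 = 3 − 1 = 2, for
interaction (R − 1)(C − 1) = (2 − 1)(3 − 1) = 2, and for within cells
RC(n − 1) = 2 × 3(8 − 1) = 42. These sum to the total df, nRC − 1 = 47.
For these data the fixed model is appropriate,
and $s_w^2$ is the proper error term for testing row, column, and interaction
effects. For interaction we have

$$F_{rc} = \frac{s_{rc}^2}{s_w^2} = \frac{666.02}{1{,}015.89} = .66$$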


This is less than unity. The expectation on the basis of the null hypothesis
is unity. The interaction is somewhat less than we would ordinarily expect
under the null hypothesis. We may safely conclude that there is no significant
interaction between the two experimental variables.

Table 16.6  Analysis of variance for data of Table 16.4

Source of variation       Sum of squares    Degrees of freedom    Variance estimate
Rows (environments)       14,875.52          1                    14,875.52 = $s_r^2$
Columns (strains)         18,150.04          2                     9,075.02 = $s_c^2$
Interaction                1,332.04          2                       666.02 = $s_{rc}^2$
Within cells              42,667.38         42                     1,015.89 = $s_w^2$
Total                     77,024.98         47

$F_r = s_r^2/s_w^2 = 14.64$          $F_c = s_c^2/s_w^2 = 8.93$

For differences in environments we have

$$F_r = \frac{s_r^2}{s_w^2} = \frac{14{,}875.52}{1{,}015.89} = 14.64$$


with 1 df associated with the numerator and 42 df with the denominator. 
For these df the values required for significance at the 5 and 1 per cent 
levels are 4.07 and 7.27. We conclude that the different environments 
have affected the maze performance of the animals. For strains the required
ratio is $F_c = s_c^2/s_w^2 = 9,075.02/1,015.89 = 8.93$ with 2 df associated
with the numerator and 42 df with the denominator. Again, this difference 
is significant at well beyond the 1 per cent level, and the conclusion is that 
differences in strain affect maze performance. 
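The arithmetic of this example can be retraced in a few lines of Python. The sketch below is illustrative; the variable names are hypothetical, and the input figures are the sums of squared totals, the grand total, and the sum of squared observations reported for these data. The fixed model is assumed, so each effect is tested against the within-cells estimate.

```python
# Summary quantities for the maze data: n = 8 animals in each of 2 x 3 cells.
n, R, C = 8, 2, 3
N = n * R * C
T = 3343                  # sum of all observations
sum_row_sq = 5_944_837    # sum of squared row totals
sum_col_sq = 4_015_617    # sum of squared column totals
sum_cell_sq = 2_137_469   # sum of squared cell totals
sum_x_sq = 309_851        # sum of squared observations

correction = T ** 2 / N
ss = {
    "rows": sum_row_sq / (n * C) - correction,
    "columns": sum_col_sq / (n * R) - correction,
    "interaction": sum_cell_sq / n - sum_row_sq / (n * C)
    - sum_col_sq / (n * R) + correction,
    "within": sum_x_sq - sum_cell_sq / n,
}
df = {"rows": R - 1, "columns": C - 1,
      "interaction": (R - 1) * (C - 1), "within": R * C * (n - 1)}
ms = {source: ss[source] / df[source] for source in ss}

# Fixed model: every effect is tested against the within-cells variance estimate.
f_ratios = {source: ms[source] / ms["within"]
            for source in ("rows", "columns", "interaction")}

for source in ss:
    print(source, round(ss[source], 2), df[source], round(ms[source], 2))
print({source: round(f, 2) for source, f in f_ratios.items()})
```

The critical values against which the F ratios are judged still come from a table of F; the sketch reproduces only the sums of squares, variance estimates, and ratios.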


16.11 UNEQUAL NUMBERS IN THE SUBCLASSES


Situations arise in educational and psychological research where the 
numbers of observations in the subclasses in a two-way analysis of variance 
are unequal. In animal experimentation in psychology, this situation may 
result from loss by death or accident of a number of animals during the 
conduct of the experiment. For the fixed model, if the cell frequencies do 
not depart significantly from either equality or proportionality, simple 
adjustments may be made to the data. Two methods will be briefly de- 
scribed: the method of expected equal frequencies and the method of 
expected proportionate frequencies. The treatment given here is based on 
the work of Fei Tsao (1946). These methods are also described by Bancroft 
(1968). 


In applying the method of expected equal frequencies the following steps 
are involved: 


1 Apply a $\chi^2$ criterion to determine whether the cell frequencies depart
from equality. Denote the frequency in the cell corresponding to the
rth row and cth column by $n_{rc}$. The expected equal frequency is the
average value of $n_{rc}$, or $N/RC$. Denote this by $\bar{n}$. The required $\chi^2$ is

$$\chi^2 = \sum_{r=1}^{R} \sum_{c=1}^{C} \frac{(n_{rc} - \bar{n})^2}{\bar{n}}$$

with RC − 1 degrees of freedom.


2 If the cell frequencies do not depart significantly from equality at, say,
the 1 per cent level, apply a simple adjustment to the sum and sum of
squares for each cell by multiplying these values by $\bar{n}/n_{rc}$. Thus the
adjusted cell sum is

$$\frac{\bar{n}}{n_{rc}}\, T_{rc}$$

and the adjusted cell sum of squares is

$$\frac{\bar{n}}{n_{rc}} \sum_{i=1}^{n_{rc}} X_{rci}^2$$

This adjustment estimates what the cell sum and sum of squares would
be were there an equal number of cases $\bar{n}$ in each cell. Note that this
adjustment does not change the cell means or the row and column
means.


3 Use the adjusted cell sums and sums of squares to obtain row and
column totals and the total sum of squares.


4 Proceed with the analysis of variance in the usual way, employing the
computation formulas given in Sec. 16.9.


The method of expected equal frequencies is simple and may be usefully
applied where the numbers of observations in the cells do not differ very
much.
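The first two steps of the method of expected equal frequencies may be sketched in Python as follows. The function name and the cell frequencies are hypothetical; the sketch computes the expected equal frequency, the $\chi^2$ criterion with its degrees of freedom, and the multiplier applied to each cell sum and cell sum of squares.

```python
def equal_frequency_adjustment(cell_counts):
    """Return n-bar, chi-square, its df, and the per-cell multipliers n-bar / n_rc."""
    R, C = len(cell_counts), len(cell_counts[0])
    N = sum(sum(row) for row in cell_counts)
    n_bar = N / (R * C)  # expected equal frequency
    chi_sq = sum((n_rc - n_bar) ** 2 / n_bar
                 for row in cell_counts for n_rc in row)
    factors = [[n_bar / n_rc for n_rc in row] for row in cell_counts]
    return n_bar, chi_sq, R * C - 1, factors


# Hypothetical cell frequencies for a 2 x 3 design after some loss of cases.
n_bar, chi_sq, df, factors = equal_frequency_adjustment([[8, 7, 9], [10, 8, 6]])
print(n_bar, chi_sq, df)
```

The value of $\chi^2$ is referred to a table of $\chi^2$ with the stated degrees of freedom; only if the departure from equality is not significant are the multipliers applied.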

In situations where the numbers of observations in the cells differ, but
are roughly proportionate to the marginal totals, the method of expected
proportionate frequencies is appropriate. This method requires the following
steps:


1 Apply a $\chi^2$ criterion to determine whether the cell frequencies in the
rows and columns depart significantly from proportionality. Denote
the observed frequency in the cell corresponding to the rth row and
cth column by $n_{rc}$ and the marginal frequencies for rows and columns
by $n_{r.}$ and $n_{.c}$, respectively. Denote the cell frequencies expected on
the assumption of proportionality by $n'_{rc}$. The expected frequencies
are given by

$$n'_{rc} = \frac{n_{r.}\, n_{.c}}{N}$$

The procedure here is identical with that used in calculating expected
cell frequencies for a contingency table given the restrictions of the
marginal totals. The $\chi^2$ criterion is

$$\chi^2 = \sum_{r=1}^{R} \sum_{c=1}^{C} \frac{(n_{rc} - n'_{rc})^2}{n'_{rc}}$$

with (R − 1)(C − 1) degrees of freedom.


2 If the cell frequencies do not depart significantly from proportionality,
the sum and sum of squares for each cell are adjusted by multiplying
them by $n'_{rc}/n_{rc}$. The adjusted cell sum is then

$$\frac{n'_{rc}}{n_{rc}}\, T_{rc}$$

and the adjusted cell sum of squares is

$$\frac{n'_{rc}}{n_{rc}} \sum_{i=1}^{n_{rc}} X_{rci}^2$$

This adjustment provides estimates of what the cell sums and sums of
squares would be were the numbers in each cell proportional to the
marginal totals.


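3 The required sums of squares for the analysis of variance are obtained,
using the adjusted values, by applying the following formulas:

[16.12]   Rows            $\sum_{r=1}^{R} \frac{T_{r.}^2}{n_{r.}} - \frac{T^2}{N}$

[16.13]   Columns         $\sum_{c=1}^{C} \frac{T_{.c}^2}{n_{.c}} - \frac{T^2}{N}$

[16.14]   Within cells    $\sum_{r=1}^{R} \sum_{c=1}^{C} \left( \frac{n'_{rc}}{n_{rc}} \sum_{i=1}^{n_{rc}} X_{rci}^2 \right) - \sum_{r=1}^{R} \sum_{c=1}^{C} \frac{T_{rc}^2}{n'_{rc}}$

[16.15]   Interaction     $\sum_{r=1}^{R} \sum_{c=1}^{C} \frac{T_{rc}^2}{n'_{rc}} - \sum_{r=1}^{R} \frac{T_{r.}^2}{n_{r.}} - \sum_{c=1}^{C} \frac{T_{.c}^2}{n_{.c}} + \frac{T^2}{N}$

[16.16]   Total           $\sum_{r=1}^{R} \sum_{c=1}^{C} \left( \frac{n'_{rc}}{n_{rc}} \sum_{i=1}^{n_{rc}} X_{rci}^2 \right) - \frac{T^2}{N}$

All T's relate to adjusted values. The above formulas differ from those
previously given in Sec. 16.9 only in that they make allowance for the
fact that the numbers of cases in the subclasses are unequal.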


4 Proceed with the analysis of variance in the usual way. 
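The test of proportionality in step 1 may be sketched in Python as follows. The function name and the cell frequencies are hypothetical; the expected frequencies are the products of the marginal frequencies divided by N, as for a contingency table.

```python
def proportional_expected(cell_counts):
    """Expected proportionate frequencies, chi-square, and its degrees of freedom."""
    row_n = [sum(row) for row in cell_counts]        # marginal row frequencies
    col_n = [sum(col) for col in zip(*cell_counts)]  # marginal column frequencies
    N = sum(row_n)
    expected = [[rn * cn / N for cn in col_n] for rn in row_n]
    chi_sq = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(cell_counts, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_n) - 1) * (len(col_n) - 1)
    return expected, chi_sq, df


# Hypothetical frequencies that depart slightly from proportionality.
expected, chi_sq, df = proportional_expected([[5, 7], [5, 13]])
print(expected, chi_sq, df)
```

The multipliers of step 2 are then the ratios of the expected to the observed frequencies, cell by cell.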


In the above procedure the within-cells sum of squares is based on the 
adjusted values. Arguments may be advanced for using the unadjusted 
values in calculating the within-cells sum of squares. For comment on 
this point see Gourlay (1955). 

Both the methods of expected equal and expected proportionate fre- 
quencies are in some degree approximate. Departure from equal n’s in 
the former method and from proportionality of n’s in the latter method 
will introduce some bias in the F test, the extent of the bias being related 
to the magnitude of the departures. By bias here is meant that the F test 
produces either a larger or smaller proportion of significant F ratios than 
is warranted by the F distribution. 

The methods of equal and proportionate frequencies are applicable 
to a substantial proportion of situations encountered in practice. When 
the frequencies differ markedly from proportionality, other methods may 
be applied. For a discussion of these, see Winer (1962) and Bancroft (1968). 

For the random model, bias is introduced in the F test despite the
proportionality of the numbers in the subclasses. From a practical
viewpoint this is not an important consideration. Good examples of the random
model with unequal n's are difficult to find in educational and psychological
research. Of more practical importance is the fact that for the mixed model
bias is introduced in the F test even when the cell frequencies are
proportional, and experiments involving this model are not infrequent. The
bias is positive, the F test producing a larger proportion of significant F
ratios than the F distribution warrants. For a discussion of this problem the
reader is directed to Gourlay (1955).

In general, because of the complications associated with unequal
frequencies, it is advisable, whenever possible, to design experiments with
an equal number of cases in the subclasses, although for the fixed model
proportionate numbers of cases in the subclasses will introduce no bias.
The investigator will thereby avoid a number of inconvenient complexities.


16.12 ONE-FACTOR EXPERIMENTS
WITH REPEATED MEASUREMENTS


Many experiments in psychology and education require the repeated 
measurement of the same subjects under a number of different conditions.
In such experiments it is sometimes said that each subject acts as his own 
control. The simplest experiment of this type would be one in which the 
same subjects are tested under two conditions, a control condition and an 
experimental condition. Not infrequently the same subjects are tested 
under five or six different experimental conditions. Sometimes, when the 
same subjects are tested under a number of different treatments, the 
order of the presentation of treatments to subjects is randomized
independently for each subject. The purpose of randomization in this case is to
eliminate effects which might result from the order of the treatments.
In other cases randomization is not appropriate because the different 
levels of the treatment variable have a natural order. This is the case 
where performance is measured at different time intervals, as, for example, 
in the study of changes in dark adaptation with time, or for different num- 
bers of trials in a simple learning experiment. Such experiments are 
called one-factor experiments with repeated measurements. 

The data resulting from such experiments may be represented as a 
table of numbers in which rows represent experimental subjects and col- 
umns represent treatments, that is, the representation of the data is the 
same as that for the two-way classification with one observation per cell. 
The analysis of such data involves nothing new. The data are analyzed as 
in the two-way classification case with one observation per cell. Three 
sums of squares result: sums of squares for subjects (rows), treatments 
(columns), and interaction. 

With such data, subjects constitute a random variable and treatments
are usually viewed as fixed. The model is the mixed model for n = 1. The
expectations of the mean squares are as follows:


Mean squares                 Expectation of mean squares
Subjects, $s_r^2$            $\sigma_e^2 + C\sigma_a^2$
Treatments, $s_c^2$          $\sigma_e^2 + \sigma_{ab}^2 + R\sigma_b^2$
Interaction, $s_{rc}^2$      $\sigma_e^2 + \sigma_{ab}^2$


The proper error term for testing differences between treatments is
$s_{rc}^2$, that is, $F_c = s_c^2/s_{rc}^2$. No unbiased test of individual
differences between subjects is possible, unless it is assumed that the
interaction term is zero. With nearly all sets of data this assumption is not
warranted, because the performance of subjects under different pairs of
treatments is correlated. In most experiments of this type individual
differences between subjects are of limited interest anyway, because with most
variables which are the object of study the investigator expects a priori
substantial differences between subjects.


16.13 ILLUSTRATIVE EXAMPLE OF ONE-FACTOR
EXPERIMENT WITH REPEATED MEASUREMENTS


Table 16.7 shows hypothetical data for a one-factor experiment with
repeated measurements. Rows are individuals, and columns are treatments.
The data are presumed to relate to a random sample of individuals
tested under different treatment conditions. This is a mixed model. One
basis of classification, the columns, is fixed. The other basis of
classification, the rows, is random.


Table 16.7

Data for the analysis of variance with two-way classification: n = 1, scores
for a sample of subjects tested under four different conditions

                        Conditions
Subject       A        B        C        D      $T_{r.}$   $\bar{X}_{r.}$
1            31       42       14       80        167        41.75
2            42       26       25      106        199        49.75
3            84       21       19       83        207        51.75
4            26       60       36       69        191        47.75
5            14       35       44       48        141        35.25
6            16       80       28       76        200        50.00
7            29       49       80       39        197        49.25
8            32       38       76       84        230        57.50
9            45       65       15       91        216        54.00
10           30       71       82       39        222        55.50
$T_{.c}$    349      487      419      715      T = 1,970
$\bar{X}_{.c}$  34.90  48.70  41.90  71.50                   49.25

\[ \sum_{r=1}^{R} T_{r.}^2 = 394{,}350 \qquad \sum_{c=1}^{C} T_{.c}^2 = 1{,}045{,}756 \qquad \sum_{r=1}^{R} \sum_{c=1}^{C} X_{rc}^2 = 122{,}984 \]


Applying the appropriate computation formulas, the following sums of
squares are obtained:

Rows
\[ \frac{1}{C} \sum_{r=1}^{R} T_{r.}^2 - \frac{T^2}{N} = \frac{394{,}350}{4} - \frac{(1{,}970)^2}{40} = 1{,}565.00 \]

Columns
\[ \frac{1}{R} \sum_{c=1}^{C} T_{.c}^2 - \frac{T^2}{N} = \frac{1{,}045{,}756}{10} - \frac{(1{,}970)^2}{40} = 7{,}553.10 \]

Interaction
\[ \sum_{r=1}^{R} \sum_{c=1}^{C} X_{rc}^2 - \frac{1}{C} \sum_{r=1}^{R} T_{r.}^2 - \frac{1}{R} \sum_{c=1}^{C} T_{.c}^2 + \frac{T^2}{N} = 122{,}984 - \frac{394{,}350}{4} - \frac{1{,}045{,}756}{10} + \frac{(1{,}970)^2}{40} = 16{,}843.40 \]

Total
\[ \sum_{r=1}^{R} \sum_{c=1}^{C} X_{rc}^2 - \frac{T^2}{N} = 122{,}984 - \frac{(1{,}970)^2}{40} = 25{,}961.50 \]


Table 16.8 summarizes the analysis-of-variance data for this example.
Because this is a mixed model with n = 1 and $F_r = s_r^2/s_{rc}^2 = .279$, no
meaningful test of row effects is possible. The proper error term for column
effects is $s_{rc}^2$. The F ratio for column effects is found to be 4.04. The F
ratios required for significance with 3 and 27 degrees of freedom associated
with the numerator and denominator, respectively, are 2.96 at the 5 per
cent and 4.60 at the 1 per cent levels. Thus the column differences are
significant at the 5 per cent level but fall short of significance at the 1 per
cent level.


Table 16.8

Analysis of variance for data of Table 16.7

Source of        Sum of        Degrees of     Variance
variation        squares       freedom        estimate

Rows             1,565.00          9            173.89 = $s_r^2$
Columns          7,553.10          3          2,517.70 = $s_c^2$
Interaction     16,843.40         27            623.83 = $s_{rc}^2$
Total           25,961.50         39

\[ F_r = \frac{173.89}{623.83} = .279 \qquad F_c = \frac{2{,}517.70}{623.83} = 4.04 \]
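The arithmetic of this example can be checked with a short script. The following is a minimal sketch, not part of the original text: the function name `repeated_measures_anova` and the dictionary keys are illustrative choices, and the scores are those of Table 16.7. It applies the computation formulas for the two-way classification with one observation per cell and forms the F ratio for columns with the interaction mean square as error term.

```python
# Scores from Table 16.7: one row per subject, one column per condition (A to D).
scores = [
    [31, 42, 14, 80],
    [42, 26, 25, 106],
    [84, 21, 19, 83],
    [26, 60, 36, 69],
    [14, 35, 44, 48],
    [16, 80, 28, 76],
    [29, 49, 80, 39],
    [32, 38, 76, 84],
    [45, 65, 15, 91],
    [30, 71, 82, 39],
]


def repeated_measures_anova(data):
    """Sums of squares and F for columns, two-way layout with n = 1 per cell."""
    n_rows, n_cols = len(data), len(data[0])
    n_total = n_rows * n_cols
    grand_total = sum(sum(row) for row in data)
    correction = grand_total ** 2 / n_total          # T^2 / N

    row_totals = [sum(row) for row in data]
    col_totals = [sum(col) for col in zip(*data)]

    ss_rows = sum(t * t for t in row_totals) / n_cols - correction
    ss_cols = sum(t * t for t in col_totals) / n_rows - correction
    ss_total = sum(x * x for row in data for x in row) - correction
    ss_inter = ss_total - ss_rows - ss_cols

    ms_cols = ss_cols / (n_cols - 1)
    ms_inter = ss_inter / ((n_rows - 1) * (n_cols - 1))
    return {
        "ss_rows": ss_rows,
        "ss_cols": ss_cols,
        "ss_inter": ss_inter,
        "ss_total": ss_total,
        "f_cols": ms_cols / ms_inter,   # interaction is the error term
    }


result = repeated_measures_anova(scores)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

The interaction sum of squares is obtained here by subtraction from the total, which is equivalent to the four-term formula when there is one observation per cell.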


EXERCISES


1 In an experiment involving double classification with 10 observations
in each cell, the following cell and marginal means were obtained:


$C_1$ $C_2$ $C_3$


9.6 


9.9 


10.4 3.9 150 98 


Compute (a) the cell means expected under zero interaction and (b)
the interaction sum of squares.


2 The following are measurements made on a sample of 12 subjects 
under three experimental conditions: 


             Condition

Subject   $C_1$   $C_2$   $C_3$

1 8 7 15 

2 19 n 20 

3 7 9 6 

4 23 20 18 

5 14 26 12 

6 6 14 15 

7 5 9 20 

8 22 25 20 

9 1 15 16 

10 4 12 8 

11 13 18 20 

12 8 6 28 

$T_{.c}$ 140 175 198
$\bar{X}_{.c}$ 11.67 14.58 16.50


Obtain the sums of squares and the variance estimates. Test the
column means on the assumption that experimental condition is a
fixed variable.


3 The following are data for a double-classification experiment involving
two fixed variables:


$C_1$ $C_2$ $C_3$


Apply the analysis of variance to test the significance of row, column,
and interaction effects.


4 The following are data with unequal numbers in the subclasses:


$C_1$ $C_2$


Apply the analysis of variance to test row, column, and interaction
effects, on the assumption that the two experimental variables are fixed.


ANALYSIS OF VARIANCE:
THREE-WAY CLASSIFICATION


17.1 INTRODUCTION


Many experiments involve the simultaneous study of more than two
independent variables or factors. For example, an experiment may involve
two levels or categories of one factor, three of a second factor, and five of a
third factor, with n subjects assigned at random to each of the 2 × 3 × 5
groups. Such an experiment may be spoken of as a "2 × 3 × 5 factorial
experiment." The data from such an experiment may be conceptualized as
a three-dimensional cube of numbers containing two rows, three columns,
and five layers, with n observations in each of the 30 different cells of which
the cube may be thought to be composed. Another experiment may involve
two levels or categories of one factor and three levels or categories of
another factor, with each of the 2 × 3 conditions being administered to
each of N subjects. This is a repeated-measurement design, with each
subject receiving the six different combinations of experimental
treatments. Here again the data may be represented by a cube of numbers with
one observation or measurement in each of the 2 × 3 × N cells.

The analysis and interpretation of data resulting from such experiments
are a direct extension of the analysis and interpretation of data for two-way
classification. In a two-factor experiment with n observations in each cell
the total sum of squares is divided into four parts, a between-rows, a
between-columns, an interaction, and a within-cells sum of squares.
In a three-way classification, or three-factor, experiment with n
observations per cell, the total sum of squares is partitioned into eight parts,
three sums of squares for main effects, four interaction sums of squares,
and a within-cells sum of squares. Each sum of squares has an associated
number of degrees of freedom. Sums of squares are, as previously, divided
by their associated degrees of freedom to obtain variance estimates, or
mean squares, which are used to test the significance of main effects and
interactions. Although such analysis and interpretation are clearly more
complex than for two-way classification, the essential ideas are the same.


17.2 NOTATION FOR THREE-WAY ANALYSIS OF VARIANCE


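Consider an experiment involving R levels of one factor, C levels of a
second factor, and L levels of a third factor. The number of treatment
combinations is RCL. Consider the particular case where we have one
measurement, or observation, for each of the RCL combinations, the total
number of measurements being N. The data for the first layer of numbers
may be represented as follows:

                                Column
Row      1            2            3          ...      C              Mean
1     $X_{111}$    $X_{121}$    $X_{131}$     ...   $X_{1C1}$    $\bar{X}_{1.1}$
2     $X_{211}$    $X_{221}$    $X_{231}$     ...   $X_{2C1}$    $\bar{X}_{2.1}$
3     $X_{311}$    $X_{321}$    $X_{331}$     ...   $X_{3C1}$    $\bar{X}_{3.1}$
...
R     $X_{R11}$    $X_{R21}$    $X_{R31}$     ...   $X_{RC1}$    $\bar{X}_{R.1}$
Mean  $\bar{X}_{.11}$  $\bar{X}_{.21}$  $\bar{X}_{.31}$  ...  $\bar{X}_{.C1}$  $\bar{X}_{..1}$

Here the first subscript identifies the row, the second the column, and the
third the layer. Thus, for example, $X_{321}$ denotes the observation in the third
row and second column of the first layer. The mean $\bar{X}_{1.1}$ is the mean of the
first row of the first layer, $\bar{X}_{.11}$ is the mean of the first column of the first
layer, and $\bar{X}_{..1}$ is the mean of all the observations in the first layer.
Notation for the second layer is as follows:

                                Column
Row      1            2            3          ...      C              Mean
1     $X_{112}$    $X_{122}$    $X_{132}$     ...   $X_{1C2}$    $\bar{X}_{1.2}$
2     $X_{212}$    $X_{222}$    $X_{232}$     ...   $X_{2C2}$    $\bar{X}_{2.2}$
3     $X_{312}$    $X_{322}$    $X_{332}$     ...   $X_{3C2}$    $\bar{X}_{3.2}$
...
R     $X_{R12}$    $X_{R22}$    $X_{R32}$     ...   $X_{RC2}$    $\bar{X}_{R.2}$
Mean  $\bar{X}_{.12}$  $\bar{X}_{.22}$  $\bar{X}_{.32}$  ...  $\bar{X}_{.C2}$  $\bar{X}_{..2}$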



Similarly the third, fourth, and Lth layers may be considered. In general
$X_{rcl}$ denotes a measurement for the rth row, the cth column, and the lth
layer. The reader should note that R denotes the number of rows, C the
number of columns, and L the number of layers. The symbol r denotes the
rth row, where r may take the values 1, 2, . . . , R. Similarly c and l
denote the cth column and the lth layer, respectively.

The grand mean of all the RCL = N observations is $\bar{X}_{...}$. The total sum of
squares of deviations about the grand mean is given by

\[ \sum_{r=1}^{R} \sum_{c=1}^{C} \sum_{l=1}^{L} (X_{rcl} - \bar{X}_{...})^2 \]

In many experiments there are n sampling units and measurements in each
of the RCL treatment combinations. The total number of measurements is
then nRCL = N. The notation for such data in the particular case where
R = 2, C = 3, L = 2, and n = 3 may be represented as follows:

Layer 1
                            Column
Row         1              2              3              Mean
         $X_{1111}$     $X_{1211}$     $X_{1311}$
1        $X_{1112}$     $X_{1212}$     $X_{1312}$     $\bar{X}_{1.1.}$
         $X_{1113}$     $X_{1213}$     $X_{1313}$

         $X_{2111}$     $X_{2211}$     $X_{2311}$
2        $X_{2112}$     $X_{2212}$     $X_{2312}$     $\bar{X}_{2.1.}$
         $X_{2113}$     $X_{2213}$     $X_{2313}$

Mean     $\bar{X}_{.11.}$  $\bar{X}_{.21.}$  $\bar{X}_{.31.}$  $\bar{X}_{..1.}$

Layer 2
                            Column
Row         1              2              3              Mean
         $X_{1121}$     $X_{1221}$     $X_{1321}$
1        $X_{1122}$     $X_{1222}$     $X_{1322}$     $\bar{X}_{1.2.}$
         $X_{1123}$     $X_{1223}$     $X_{1323}$

         $X_{2121}$     $X_{2221}$     $X_{2321}$
2        $X_{2122}$     $X_{2222}$     $X_{2322}$     $\bar{X}_{2.2.}$
         $X_{2123}$     $X_{2223}$     $X_{2323}$

Mean     $\bar{X}_{.12.}$  $\bar{X}_{.22.}$  $\bar{X}_{.32.}$  $\bar{X}_{..2.}$

Quadruple subscripts are used. The first identifies the row, the second the
column, the third the layer, and the fourth the measurement within the
cell. Thus, for example, the symbol $X_{1323}$ indicates the third measurement
in the cell corresponding to the first row, the third column, and the second
layer. In general $X_{rcli}$ denotes the ith measurement in the rth row and cth
column of the lth layer, where i = 1, 2, . . . , n. Row and column means
for each layer are shown. Thus $\bar{X}_{1.1.}$ is the mean for the first row of the first
layer, $\bar{X}_{.22.}$ is the mean for the second column of the second layer, and so
on. The grand mean, the mean of all nRCL observations, is $\bar{X}_{....}$. The total
sum of squares for a triple-classification experiment with n observations
per cell may be denoted by

\[ \sum_{r=1}^{R} \sum_{c=1}^{C} \sum_{l=1}^{L} \sum_{i=1}^{n} (X_{rcli} - \bar{X}_{....})^2 \]

The sum of squares, both in the case where n = 1 and where n > 1, may be
partitioned into additive sums of squares.


17.3 PARTITIONING THE SUM OF SQUARES


With a single measurement in each of the RCL treatment combinations,
the total sum of squares may be partitioned into seven additive parts, three
main effects for rows, columns, and layers, three first-order interaction
terms, and one second-order interaction term. The first-order interaction
terms are row by column, row by layer, and column by layer. The
second-order interaction term is row by column by layer. The precise
meaning which attaches to each of these terms is described in detail in
Sec. 17.5. With more than one observation per cell, n > 1, the total sum of
squares is partitioned into eight additive parts: three main effects, three
first-order interaction terms, one second-order interaction term, and one
within-cells sum of squares.

The procedure used in partitioning the sum of squares is directly
analogous to that used in the two-way classification case, although somewhat
more complex. For n > 1 we begin with the rather involved identity,

\[
\begin{aligned}
(X_{rcli} - \bar{X}_{....}) ={}& (\bar{X}_{r...} - \bar{X}_{....}) + (\bar{X}_{.c..} - \bar{X}_{....}) + (\bar{X}_{..l.} - \bar{X}_{....}) \\
&+ (\bar{X}_{rc..} - \bar{X}_{r...} - \bar{X}_{.c..} + \bar{X}_{....}) \\
&+ (\bar{X}_{r.l.} - \bar{X}_{r...} - \bar{X}_{..l.} + \bar{X}_{....}) \\
&+ (\bar{X}_{.cl.} - \bar{X}_{.c..} - \bar{X}_{..l.} + \bar{X}_{....}) \\
&+ (\bar{X}_{rcl.} - \bar{X}_{rc..} - \bar{X}_{r.l.} - \bar{X}_{.cl.} + \bar{X}_{r...} + \bar{X}_{.c..} + \bar{X}_{..l.} - \bar{X}_{....}) \\
&+ (X_{rcli} - \bar{X}_{rcl.})
\end{aligned}
\]

As in the two-way classification case both sides are squared, summed
over R rows, C columns, and L layers, and over the n observations in each
cell. Certain terms contain a sum of deviations about a mean. These
vanish, and the eight sums of squares as shown in Table 17.1 result. Note
that three sums of squares for main effects are obtained, three first-order
interaction sums of squares, one second-order interaction sum of squares,
and one within-cells sum of squares. With one observation in each cell the
required sums of squares are obtained from Table 17.1 simply by writing
n = 1. No within-cells sum of squares exists, and in this case the total sum
of squares is partitioned into seven, and not eight, additive parts.
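The additivity of the eight parts can be verified numerically. The sketch below is illustrative only: the function name `partition_three_way`, the dictionary layout keyed by (row, column, layer, observation), and the randomly generated scores are all assumptions made for the demonstration. It squares and sums each term of the identity over every observation and compares the result with the total sum of squares. Equal numbers of observations per cell are assumed.

```python
import random


def partition_three_way(x):
    """x maps (r, c, l, i) to a score; returns the eight parts and the total."""

    def mean(r=None, c=None, l=None):
        # Mean of all scores matching the fixed subscripts; None means "dot".
        vals = [v for (kr, kc, kl, _), v in x.items()
                if (r is None or kr == r)
                and (c is None or kc == c)
                and (l is None or kl == l)]
        return sum(vals) / len(vals)

    g = mean()  # grand mean
    names = ["rows", "columns", "layers", "RxC", "RxL", "CxL",
             "RxCxL", "within", "total"]
    ss = dict.fromkeys(names, 0.0)
    for (r, c, l, _), v in x.items():
        mr, mc, ml = mean(r=r), mean(c=c), mean(l=l)
        mrc, mrl, mcl = mean(r=r, c=c), mean(r=r, l=l), mean(c=c, l=l)
        mrcl = mean(r=r, c=c, l=l)
        ss["rows"] += (mr - g) ** 2
        ss["columns"] += (mc - g) ** 2
        ss["layers"] += (ml - g) ** 2
        ss["RxC"] += (mrc - mr - mc + g) ** 2
        ss["RxL"] += (mrl - mr - ml + g) ** 2
        ss["CxL"] += (mcl - mc - ml + g) ** 2
        ss["RxCxL"] += (mrcl - mrc - mrl - mcl + mr + mc + ml - g) ** 2
        ss["within"] += (v - mrcl) ** 2
        ss["total"] += (v - g) ** 2
    return ss


# Arbitrary scores for a 2 x 3 x 2 layout with n = 3 per cell (made-up data).
random.seed(1)
data = {(r, c, l, i): random.gauss(50, 10)
        for r in range(2) for c in range(3) for l in range(2) for i in range(3)}

ss = partition_three_way(data)
parts = sum(value for name, value in ss.items() if name != "total")
print(abs(parts - ss["total"]) < 1e-6)
```

Summing each squared term over every observation reproduces the multipliers that appear in front of the sums of squared deviations of means, because each marginal mean is counted once for every observation it is based on.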


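Table 17.1

Analysis of variance for three-way classification with n entries per cell: n > 1

Source: Rows
  Sum of squares: $nCL \sum_{r} (\bar{X}_{r...} - \bar{X}_{....})^2$
  df: R - 1          Variance estimate: $s_r^2$

Source: Columns
  Sum of squares: $nRL \sum_{c} (\bar{X}_{.c..} - \bar{X}_{....})^2$
  df: C - 1          Variance estimate: $s_c^2$

Source: Layers
  Sum of squares: $nRC \sum_{l} (\bar{X}_{..l.} - \bar{X}_{....})^2$
  df: L - 1          Variance estimate: $s_l^2$

Source: R × C
  Sum of squares: $nL \sum_{r} \sum_{c} (\bar{X}_{rc..} - \bar{X}_{r...} - \bar{X}_{.c..} + \bar{X}_{....})^2$
  df: (R - 1)(C - 1)          Variance estimate: $s_{rc}^2$

Source: R × L
  Sum of squares: $nC \sum_{r} \sum_{l} (\bar{X}_{r.l.} - \bar{X}_{r...} - \bar{X}_{..l.} + \bar{X}_{....})^2$
  df: (R - 1)(L - 1)          Variance estimate: $s_{rl}^2$

Source: C × L
  Sum of squares: $nR \sum_{c} \sum_{l} (\bar{X}_{.cl.} - \bar{X}_{.c..} - \bar{X}_{..l.} + \bar{X}_{....})^2$
  df: (C - 1)(L - 1)          Variance estimate: $s_{cl}^2$

Source: R × C × L
  Sum of squares: $n \sum_{r} \sum_{c} \sum_{l} (\bar{X}_{rcl.} - \bar{X}_{rc..} - \bar{X}_{r.l.} - \bar{X}_{.cl.} + \bar{X}_{r...} + \bar{X}_{.c..} + \bar{X}_{..l.} - \bar{X}_{....})^2$
  df: (R - 1)(C - 1)(L - 1)          Variance estimate: $s_{rcl}^2$

Source: Within cells
  Sum of squares: $\sum_{r} \sum_{c} \sum_{l} \sum_{i} (X_{rcli} - \bar{X}_{rcl.})^2$
  df: RCL(n - 1)          Variance estimate: $s_w^2$

Source: Total
  Sum of squares: $\sum_{r} \sum_{c} \sum_{l} \sum_{i} (X_{rcli} - \bar{X}_{....})^2$
  df: N - 1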


17.4 DEGREES OF FREEDOM AND MEAN SQUARES 


As shown in Table 17.1 the number of degrees of freedom for rows is
R - 1, for columns C - 1, and for layers L - 1. The first-order interaction
for row-by-column has (R - 1)(C - 1) degrees of freedom. Similarly the
row-by-layer and column-by-layer interactions have (R - 1)(L - 1) and
(C - 1)(L - 1) degrees of freedom, respectively. The second-order
interaction term has (R - 1)(C - 1)(L - 1) degrees of freedom. Each
cell has associated with it n - 1 degrees of freedom. Since there are RCL
cells, the total number of degrees of freedom associated with the
within-cells sum of squares is RCL(n - 1). The degrees of freedom are directly
additive, that is, it may be shown that

\[
\begin{aligned}
N - 1 = nRCL - 1 ={}& (R - 1) + (C - 1) + (L - 1) \\
&+ (R - 1)(C - 1) + (R - 1)(L - 1) + (C - 1)(L - 1) \\
&+ (R - 1)(C - 1)(L - 1) + RCL(n - 1)
\end{aligned}
\]

For n = 1 the degrees of freedom are the same as for n > 1 except that no
within-cells sum of squares exists.

Sums of squares are divided by their associated number of degrees of
freedom to obtain variance estimates, or mean squares. As previously, F
ratios are formed using these mean squares to obtain tests of significance
for main effects and interaction effects. As in the two-way classification
case, the correct choice of error term, the appropriate mean square to
insert in the denominator of the F ratio, depends on whether the model
appropriate to the experiment is fixed, random, or mixed.
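The additivity of the degrees of freedom is easy to confirm for any particular design. The following is a small sketch; the function name `three_way_df` and the design sizes used in the call are illustrative assumptions.

```python
def three_way_df(R, C, L, n):
    """Degrees of freedom for each source in a three-way layout, n per cell."""
    df = {
        "rows": R - 1,
        "columns": C - 1,
        "layers": L - 1,
        "RxC": (R - 1) * (C - 1),
        "RxL": (R - 1) * (L - 1),
        "CxL": (C - 1) * (L - 1),
        "RxCxL": (R - 1) * (C - 1) * (L - 1),
        "within": R * C * L * (n - 1),   # zero when n = 1
    }
    # The parts must add to N - 1 = nRCL - 1.
    assert sum(df.values()) == n * R * C * L - 1
    return df


# A 2 x 3 x 2 design with 6 observations per cell, chosen for illustration.
df = three_way_df(R=2, C=3, L=2, n=6)
print(df)
```

With n = 1 the within-cells entry is zero, which corresponds to the case in which no within-cells sum of squares exists.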


17.5 INTERACTION IN THREE-WAY ANALYSIS OF VARIANCE 


As stated above, partitioning a sum of squares in a three-way analysis of
variance results in four interaction sums of squares, R × C, R × L, C × L,
and R × C × L. What meaning may be attached to these sums of squares?
How may they be interpreted? The answers to these questions may be
clarified by a hypothetical example. The following are cell means for a 3 × 3 ×
2 factorial design with n observations per cell.


Layer 2 
Columns 


Rows 


тү о 


The data to the left are means for the first layer, and those to the right are
means for the second layer. These data may be visualized as a 3 × 3 × 2
cube of numbers with the second layer superimposed on the first as shown
in Fig. 17.1. The means shown in Fig. 17.1 are means on the surface of the
cube obtained by averaging cell means over rows, columns, and layers, a
procedure which is, of course, correct only when the cell means are based
on equal n's. Thus in Fig. 17.1 the mean of 13 for the first row and column is
obtained by averaging the means for the first row and column over the two
layers. Thus the mean of 13 is the average of the two cell means 10 and 16.
Likewise the mean of 10 for the first column and the first layer is obtained
by averaging over rows. Thus 10 is the average of the three cell means for
the first layer, 10, 5, 15. All other means in Fig. 17.1 are similarly
obtained. Fig. 17.1 also shows certain marginal means, or means along the
edge of the cube. These means are obtained by averaging over rows,
columns, and layers. The grand mean, the mean of all the observations
taken together, is also shown, which in this case is 15.

The first-order interactions in a three-way analysis of variance, that is, the
R × C, R × L, and C × L interactions, are concerned with the parallelism,
or its absence, of the means on the surface of the cube. The rationale is
essentially the same as that used in the double-classification case.
Consider for example the means obtained by averaging over layers in Fig. 17.1.
These means are


Columns
I II III


interaction is zero. 
As in the two-way classification case the values of the cell means for
rows and columns expected in the case of zero interaction are given by
$\bar{X}_{r...} + \bar{X}_{.c..} - \bar{X}_{....}$. The observed means are $\bar{X}_{rc..}$. The
differences between the observed and expected are
$\bar{X}_{rc..} - \bar{X}_{r...} - \bar{X}_{.c..} + \bar{X}_{....}$. The R × C interaction sum of
squares is given by

\[ nL \sum_{r=1}^{R} \sum_{c=1}^{C} (\bar{X}_{rc..} - \bar{X}_{r...} - \bar{X}_{.c..} + \bar{X}_{....})^2 \]

This is seen to be nL times the sum of squares of these differences summed
over rows and columns, and is simply an over-all measure of the departures
of the means from parallelism. Note that the R × C interaction is
concerned with parallelism among means obtained by averaging over layers.
The R × L and L × C interactions involve means obtained by averaging
over columns and rows, respectively.


Fig. 17.1 Representation of data for three-way classification as a cube of numbers.

The R × C × L interaction in a three-way analysis of variance is
concerned with the similarity, or otherwise, of the interaction, whether it be
zero or not, between each pair of variables at different levels of the third
variable. Consider the following means for a 2 × 2 × 2 factorial
experiment:


R, 10 30 R, 20 40 


Inspection of the means for $L_1$ indicates that the means are not parallel and
an interaction is present. Likewise inspection of the means for $L_2$ indicates
again that the means are not parallel and an interaction is present. We
note, however, that for $L_1$ the difference between $R_1$ and $R_2$ for $C_1$ is zero.
The corresponding difference for $L_2$ is also zero. Note also that for $L_1$ the
difference between $R_1$ and $R_2$ for $C_2$ is 10. The corresponding difference for
$L_2$ is also 10. It is appropriate to say here that the interaction between rows
and columns is the same, or geometrically similar, for the two layers. In
this case the triple interaction will be zero. In this illustrative example it
may be ascertained that the R × C interaction is nonzero whereas the
R × L, the C × L, and the R × C × L interactions are zero.

The R × C × L interaction is a measure of the differences between a set
of observed values and a set of expected values. The expected values are
those we would obtain if the interactions were the same between all pairs
of variables at different levels of the third variable.
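The pattern described above can be reproduced with a small calculation. The sketch below uses hypothetical cell means, chosen only so that the row difference is zero for the first column and 10 for the second column in both layers; they are an assumption for illustration and are not offered as the values of the original table. The helper name `avg` is likewise illustrative. For each interaction the script sums the squared departures of observed means from the means expected under zero interaction.

```python
from itertools import product

# Hypothetical cell means keyed by (row, column, layer); illustration only.
means = {
    (1, 1, 1): 10, (1, 2, 1): 30,
    (2, 1, 1): 10, (2, 2, 1): 20,
    (1, 1, 2): 20, (1, 2, 2): 40,
    (2, 1, 2): 20, (2, 2, 2): 30,
}
levels = (1, 2)


def avg(r=None, c=None, l=None):
    """Average the cell means over every subscript left as None."""
    vals = [m for (kr, kc, kl), m in means.items()
            if r in (None, kr) and c in (None, kc) and l in (None, kl)]
    return sum(vals) / len(vals)


g = avg()  # grand mean

# First-order interactions: departures from parallelism on each surface.
rc = sum((avg(r=r, c=c) - avg(r=r) - avg(c=c) + g) ** 2
         for r, c in product(levels, levels))
rl = sum((avg(r=r, l=l) - avg(r=r) - avg(l=l) + g) ** 2
         for r, l in product(levels, levels))
cl = sum((avg(c=c, l=l) - avg(c=c) - avg(l=l) + g) ** 2
         for c, l in product(levels, levels))

# Second-order interaction: is the R x C pattern the same in both layers?
rcl = sum((means[r, c, l] - avg(r=r, c=c) - avg(r=r, l=l) - avg(c=c, l=l)
           + avg(r=r) + avg(c=c) + avg(l=l) - g) ** 2
          for r, c, l in product(levels, repeat=3))

print(rc, rl, cl, rcl)
```

These quantities are sums of squared departures among the means only. Multiplying by the appropriate number of observations per mean would give the corresponding interaction sums of squares.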


17.6 EXPECTATIONS OF MEAN SQUARES IN
THREE-WAY ANALYSIS OF VARIANCE


As in the two-way analysis of variance the correct choice of error term in a
three-way analysis of variance depends on the nature of the variables


fixed and the third random, or any one may be fixed 
random. Thus we may have a fixed, random, or mixed model, 

The expectations of mean squares for the ge 
in Table 17.2. The reader should note that in 


may be stated as $H_0\colon \sigma_a^2 = 0$. Likewise
$\sigma_b^2$ and $\sigma_c^2$ are the variances in the population for column and layer
means, respectively.


From the general model of Table 17.2 


‚ the random model, or any of the 


These expectations are shown
in Table 17.4.


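For the mixed model where rows are random and columns and layers are
fixed, the expectations of the mean squares are obtained by substituting
$(R_p - R)/R_p = 1$, $(C_p - C)/C_p = 0$, and $(L_p - L)/L_p = 0$ in the expectation
for the general finite model given in Table 17.2. The expectations for this
particular mixed model are shown in Table 17.5. The expectations for other
varieties of the mixed-model case can be similarly obtained.


Table 17.2

Expectations of mean squares for general finite model: three-way analysis
with n entries in each cell

Mean square                  Expectation of mean square

Rows, $s_r^2$
  $\sigma_e^2 + n\frac{C_p - C}{C_p}\frac{L_p - L}{L_p}\sigma_{abc}^2 + Ln\frac{C_p - C}{C_p}\sigma_{ab}^2 + Cn\frac{L_p - L}{L_p}\sigma_{ac}^2 + LCn\sigma_a^2$

Columns, $s_c^2$
  $\sigma_e^2 + n\frac{R_p - R}{R_p}\frac{L_p - L}{L_p}\sigma_{abc}^2 + Ln\frac{R_p - R}{R_p}\sigma_{ab}^2 + Rn\frac{L_p - L}{L_p}\sigma_{bc}^2 + RLn\sigma_b^2$

Layers, $s_l^2$
  $\sigma_e^2 + n\frac{R_p - R}{R_p}\frac{C_p - C}{C_p}\sigma_{abc}^2 + Cn\frac{R_p - R}{R_p}\sigma_{ac}^2 + Rn\frac{C_p - C}{C_p}\sigma_{bc}^2 + RCn\sigma_c^2$

R × C, $s_{rc}^2$
  $\sigma_e^2 + n\frac{L_p - L}{L_p}\sigma_{abc}^2 + Ln\sigma_{ab}^2$

R × L, $s_{rl}^2$
  $\sigma_e^2 + n\frac{C_p - C}{C_p}\sigma_{abc}^2 + Cn\sigma_{ac}^2$

C × L, $s_{cl}^2$
  $\sigma_e^2 + n\frac{R_p - R}{R_p}\sigma_{abc}^2 + Rn\sigma_{bc}^2$

R × C × L, $s_{rcl}^2$
  $\sigma_e^2 + n\sigma_{abc}^2$

Within cells, $s_w^2$
  $\sigma_e^2$


Table 17.3

Expectation of mean squares for fixed model: three-way
analysis of variance with n entries in each cell

Mean square                  Expectation of mean squares
Rows, $s_r^2$                $\sigma_e^2 + LCn\sigma_a^2$
Columns, $s_c^2$             $\sigma_e^2 + RLn\sigma_b^2$
Layers, $s_l^2$              $\sigma_e^2 + RCn\sigma_c^2$
R × C, $s_{rc}^2$            $\sigma_e^2 + Ln\sigma_{ab}^2$
R × L, $s_{rl}^2$            $\sigma_e^2 + Cn\sigma_{ac}^2$
C × L, $s_{cl}^2$            $\sigma_e^2 + Rn\sigma_{bc}^2$
R × C × L, $s_{rcl}^2$       $\sigma_e^2 + n\sigma_{abc}^2$
Within cells, $s_w^2$        $\sigma_e^2$


Table 17.4

Expectation of mean squares for random model: three-way analysis
of variance with n entries in each cell

Mean square                  Expectation of mean squares
Rows, $s_r^2$                $\sigma_e^2 + n\sigma_{abc}^2 + Ln\sigma_{ab}^2 + Cn\sigma_{ac}^2 + LCn\sigma_a^2$
Columns, $s_c^2$             $\sigma_e^2 + n\sigma_{abc}^2 + Ln\sigma_{ab}^2 + Rn\sigma_{bc}^2 + RLn\sigma_b^2$
Layers, $s_l^2$              $\sigma_e^2 + n\sigma_{abc}^2 + Cn\sigma_{ac}^2 + Rn\sigma_{bc}^2 + RCn\sigma_c^2$
R × C, $s_{rc}^2$            $\sigma_e^2 + n\sigma_{abc}^2 + Ln\sigma_{ab}^2$
R × L, $s_{rl}^2$            $\sigma_e^2 + n\sigma_{abc}^2 + Cn\sigma_{ac}^2$
C × L, $s_{cl}^2$            $\sigma_e^2 + n\sigma_{abc}^2 + Rn\sigma_{bc}^2$
R × C × L, $s_{rcl}^2$       $\sigma_e^2 + n\sigma_{abc}^2$
Within cells, $s_w^2$        $\sigma_e^2$


17.7 CHOICE OF ERROR TERM


ratios.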


Table 17.5

Expectation of mean squares for mixed model: rows are
random, columns and layers are fixed

Mean square                  Expectation of mean squares
Rows, $s_r^2$                $\sigma_e^2 + LCn\sigma_a^2$
Columns, $s_c^2$             $\sigma_e^2 + Ln\sigma_{ab}^2 + RLn\sigma_b^2$
Layers, $s_l^2$              $\sigma_e^2 + Cn\sigma_{ac}^2 + RCn\sigma_c^2$
R × C, $s_{rc}^2$            $\sigma_e^2 + Ln\sigma_{ab}^2$
R × L, $s_{rl}^2$            $\sigma_e^2 + Cn\sigma_{ac}^2$
C × L, $s_{cl}^2$            $\sigma_e^2 + n\sigma_{abc}^2 + Rn\sigma_{bc}^2$
R × C × L, $s_{rcl}^2$       $\sigma_e^2 + n\sigma_{abc}^2$
Within cells, $s_w^2$        $\sigma_e^2$


Fixed model: n > 1. The correct error term for testing all main effects and
all interaction effects is $s_w^2$. All F ratios are formed with $s_w^2$ in the
denominator.


Random model: n > 1. The correct error term for testing the R × C × L
interaction is $s_w^2$. The correct error term for testing the R × C, R × L, and
C × L interactions is $s_{rcl}^2$. No exact tests of the main effects can be made,
although approximate F tests can be used. For a discussion of such
approximate tests, see Scheffé (1959).


Mixed model: n > 1. Inspection of the expectations in the particular
mixed-model case where rows are random and columns and layers are
fixed indicates that the correct error term for testing rows, R × C, R × L,
and R × C × L interactions is the within-cells variance estimate, $s_w^2$. The
correct error term for testing the column effect is the R × C interaction
mean square, $s_{rc}^2$, and for the layer effect the R × L interaction mean
square, $s_{rl}^2$. The correct error term for testing the C × L interaction is the
R × C × L interaction mean square, $s_{rcl}^2$. The correct error term for other
examples of the mixed-model case will, of course, differ from the above,
but may be deduced from the general finite model.
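The deduction of error terms from the expectations of mean squares can be mechanized. The sketch below is an illustration under stated assumptions, not the procedure of the text: the names `expected_ms` and `error_term` are hypothetical, and each factor is treated as either wholly fixed or wholly random, so that the ratios in the general finite model are 0 or 1. A variance component enters the expectation for an effect when the component involves every factor in the effect and all of its additional factors are random. The error term is the mean square whose expectation matches that of the effect with the effect's own component removed.

```python
from itertools import combinations

FACTORS = ("R", "C", "L")
ERROR = frozenset()  # stands for the within-cells component


def all_effects():
    """Main effects and interactions as sets of factor labels."""
    return [frozenset(combo) for size in (1, 2, 3)
            for combo in combinations(FACTORS, size)]


def expected_ms(effect, is_random):
    """Set of variance components in the expectation of the effect's mean square."""
    terms = {ERROR}
    for component in all_effects():
        extra = component - effect
        if effect <= component and all(is_random[f] for f in extra):
            terms.add(component)
    return terms


def error_term(effect, is_random):
    """Name of the mean square that gives an exact F test, or None."""
    target = expected_ms(effect, is_random) - {effect}
    if target == {ERROR}:
        return "within"
    for other in all_effects():
        if other != effect and expected_ms(other, is_random) == target:
            return "x".join(f for f in FACTORS if f in other)
    return None


fixed = {"R": False, "C": False, "L": False}
random_model = {"R": True, "C": True, "L": True}
mixed = {"R": True, "C": False, "L": False}   # rows random, others fixed

for label, model in [("fixed", fixed), ("random", random_model), ("mixed", mixed)]:
    print(label, {"x".join(f for f in FACTORS if f in e): error_term(e, model)
                  for e in all_effects()})
```

A result of `None` marks an effect for which no single mean square serves as an exact error term, as is the case for the main effects in the random model.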


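17.8 COMPUTATION FORMULAS FOR SUMS OF SQUARES


As previously, computation formulas are used to calculate the required
sums of squares. The sum of all N observations is denoted by $T_{....}$. The sum of
all the observations for rows, summed within cells and then over columns
and layers, is $T_{r...}$. If we conceptualize the data as comprising a cube of
observations, $T_{r...}$ are totals along the edge of the cube. Likewise $T_{.c..}$ are
totals summed over rows and layers, and $T_{..l.}$ are totals summed over rows
and columns.

The quantities $T_{rc..}$, $T_{r.l.}$, and $T_{.cl.}$ are totals summed within cells, and
then over layers, columns, and rows, respectively. These are totals on the
surface of the cube. $T_{rcl.}$ is an individual cell total corresponding to the rth
row, the cth column, and the lth layer.


With n entries in each cell the computation formulas for the sums of
squares are as follows:

[17.1] Rows
\[ \frac{1}{nLC} \sum_{r=1}^{R} T_{r...}^2 - \frac{T_{....}^2}{N} \]

[17.2] Columns
\[ \frac{1}{nRL} \sum_{c=1}^{C} T_{.c..}^2 - \frac{T_{....}^2}{N} \]

[17.3] Layers
\[ \frac{1}{nRC} \sum_{l=1}^{L} T_{..l.}^2 - \frac{T_{....}^2}{N} \]

[17.4] R × C Interaction
\[ \frac{1}{nL} \sum_{r=1}^{R} \sum_{c=1}^{C} T_{rc..}^2 - \frac{1}{nLC} \sum_{r=1}^{R} T_{r...}^2 - \frac{1}{nRL} \sum_{c=1}^{C} T_{.c..}^2 + \frac{T_{....}^2}{N} \]

[17.5] R × L Interaction
\[ \frac{1}{nC} \sum_{r=1}^{R} \sum_{l=1}^{L} T_{r.l.}^2 - \frac{1}{nLC} \sum_{r=1}^{R} T_{r...}^2 - \frac{1}{nRC} \sum_{l=1}^{L} T_{..l.}^2 + \frac{T_{....}^2}{N} \]

[17.6] C × L Interaction
\[ \frac{1}{nR} \sum_{c=1}^{C} \sum_{l=1}^{L} T_{.cl.}^2 - \frac{1}{nRL} \sum_{c=1}^{C} T_{.c..}^2 - \frac{1}{nRC} \sum_{l=1}^{L} T_{..l.}^2 + \frac{T_{....}^2}{N} \]

[17.7] R × C × L Interaction
\[
\begin{aligned}
&\frac{1}{n} \sum_{r=1}^{R} \sum_{c=1}^{C} \sum_{l=1}^{L} T_{rcl.}^2 - \frac{1}{nL} \sum_{r=1}^{R} \sum_{c=1}^{C} T_{rc..}^2 - \frac{1}{nC} \sum_{r=1}^{R} \sum_{l=1}^{L} T_{r.l.}^2 - \frac{1}{nR} \sum_{c=1}^{C} \sum_{l=1}^{L} T_{.cl.}^2 \\
&\quad + \frac{1}{nLC} \sum_{r=1}^{R} T_{r...}^2 + \frac{1}{nRL} \sum_{c=1}^{C} T_{.c..}^2 + \frac{1}{nRC} \sum_{l=1}^{L} T_{..l.}^2 - \frac{T_{....}^2}{N}
\end{aligned}
\]

[17.8] Within cells
\[ \sum_{r=1}^{R} \sum_{c=1}^{C} \sum_{l=1}^{L} \sum_{i=1}^{n} X_{rcli}^2 - \frac{1}{n} \sum_{r=1}^{R} \sum_{c=1}^{C} \sum_{l=1}^{L} T_{rcl.}^2 \]

[17.9] Total
\[ \sum_{r=1}^{R} \sum_{c=1}^{C} \sum_{l=1}^{L} \sum_{i=1}^{n} X_{rcli}^2 - \frac{T_{....}^2}{N} \]


writing n = 1 in the computation formulas given above.


17.9 ILLUSTRATIVE EXAMPLE OF THREE-WAY CLASSIFICATION


Table 17.6 shows data for a 2 × 3 × 2 factorial experiment. Early- and
late-blinded animals belonging to three strains, bright, mixed, and dull,
were reared under two environmental conditions, free and restricted.


Table 17.6

Data for the analysis of variance with three-way classification: n > 1, error scores
for early- and late-blinded rats for three strains reared under two environmental
conditions: cell totals shown in parentheses

Early blinded
                               Strain
Environment      Bright        Mixed         Dull
                 27  22        31  37
Free             45  18        52  45
                 76  33        86  66        104 126
                 (221)         (317)         (508)

                 55  40        71  16        132 104
Restricted       81  50        98  68
                 36  70        42 104
                 (332)         (465)         (633)

Late blinded
                               Strain
Environment      Bright        Mixed         Dull
                 61  39        61  71        140 122
Free             76  60        82  92         99  92
                 46  59       103 105         68 101
                 (341)         (514)         (622)

                 88  92       100 120        142 150
Restricted       95 103       120 131         96 105
                 51  73        89  76         80 125
                 (502)         (636)         (698)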


Twelve groups of animals were used, each group comprised of six animals. 
In this example rows are environments, columns are strains, and layers are 
early- versus late-blinded. In Table 17.6 the cell totals are shown in paren- 
theses. 

For computational purposes it is necessary to write down the totals for
rows by columns summed over layers, rows by layers summed over
columns, and layers by columns summed over rows. If these data are
visualized as a cube of numbers, these are the totals on the surface of the
cube. The totals for rows by columns summed over layers, $T_{rc..}$, are as
follows:

$T_{rc..}$
                        Columns
Rows            1         2         3        $T_{r...}$
1              562       831     1,130       2,523
2              834     1,101     1,331       3,266
$T_{.c..}$   1,396     1,932     2,461       5,789 = $T_{....}$

The totals above are obtained very simply by adding the cell totals over
layers, that is, for the early- and late-blinded animals. Thus 221 + 341 =
562, 317 + 514 = 831, and so on. Totals for rows, $T_{r...}$, and for columns,
$T_{.c..}$, are also shown, together with the grand total, $T_{....}$. Totals for rows by
layers summed over columns, $T_{r.l.}$, are as follows:

$T_{r.l.}$
                    Layers
Rows            1         2        $T_{r...}$
1            1,046     1,477       2,523
2            1,430     1,836       3,266
$T_{..l.}$   2,476     3,313       5,789 = $T_{....}$

Here 1,046 is obtained by summing the cell totals for the first layer over
columns; thus, 221 + 317 + 508 = 1,046. Likewise 332 + 465 + 633 = 1,430,
and so on. Totals for layers by columns summed over rows, $T_{.cl.}$, are as
follows:

$T_{.cl.}$
                        Columns
Layers          1         2         3        $T_{..l.}$
1              553       782     1,141       2,476
2              843     1,150     1,320       3,313
$T_{.c..}$   1,396     1,932     2,461       5,789 = $T_{....}$

Here 553 is obtained by summing the cell totals for the first column over
rows; thus, 221 + 332 = 553, 317 + 465 = 782, and so on.
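From the data as arranged above nine quantities are calculated which
are the different terms in the computation formulas. These quantities in
this illustrative example are as follows:

\[ \frac{1}{nLC} \sum_{r=1}^{R} T_{r...}^2 = \frac{1}{6 \times 2 \times 3} \times 17{,}032{,}285 = 473{,}119 \]

\[ \frac{1}{nRL} \sum_{c=1}^{C} T_{.c..}^2 = \frac{1}{6 \times 2 \times 2} \times 11{,}737{,}961 = 489{,}082 \]

\[ \frac{1}{nRC} \sum_{l=1}^{L} T_{..l.}^2 = \frac{1}{6 \times 2 \times 3} \times 17{,}106{,}545 = 475{,}182 \]

\[ \frac{1}{nL} \sum_{r=1}^{R} \sum_{c=1}^{C} T_{rc..}^2 = \frac{1}{6 \times 2} \times 5{,}962{,}623 = 496{,}885 \]

\[ \frac{1}{nC} \sum_{r=1}^{R} \sum_{l=1}^{L} T_{r.l.}^2 = \frac{1}{6 \times 3} \times 8{,}691{,}441 = 482{,}858 \]

\[ \frac{1}{nR} \sum_{c=1}^{C} \sum_{l=1}^{L} T_{.cl.}^2 = \frac{1}{6 \times 2} \times 5{,}994{,}763 = 499{,}564 \]

\[ \frac{1}{n} \sum_{r=1}^{R} \sum_{c=1}^{C} \sum_{l=1}^{L} T_{rcl.}^2 = \frac{1}{6} \times 3{,}045{,}597 = 507{,}600 \]

\[ \sum_{r=1}^{R} \sum_{c=1}^{C} \sum_{l=1}^{L} \sum_{i=1}^{n} X_{rcli}^2 = 536{,}369 \]

\[ \frac{T_{....}^2}{N} = \frac{1}{72} \times 33{,}512{,}521 = 465{,}452 \]

In the above computation the quantity $\sum T_{r...}^2 = 2{,}523^2 + 3{,}266^2 =
17{,}032{,}285$, $\sum T_{.c..}^2 = 1{,}396^2 + 1{,}932^2 + 2{,}461^2 = 11{,}737{,}961$, and so on.
Applying the computation formulas the required sums of squares are then
as follows:

Rows
\[ \frac{1}{nLC} \sum_{r=1}^{R} T_{r...}^2 - \frac{T_{....}^2}{N} = 473{,}119 - 465{,}452 = 7{,}667 \]

Columns
\[ \frac{1}{nRL} \sum_{c=1}^{C} T_{.c..}^2 - \frac{T_{....}^2}{N} = 489{,}082 - 465{,}452 = 23{,}630 \]

Layers
\[ \frac{1}{nRC} \sum_{l=1}^{L} T_{..l.}^2 - \frac{T_{....}^2}{N} = 475{,}182 - 465{,}452 = 9{,}730 \]

R × C Interaction
\[ \frac{1}{nL} \sum_{r=1}^{R} \sum_{c=1}^{C} T_{rc..}^2 - \frac{1}{nLC} \sum_{r=1}^{R} T_{r...}^2 - \frac{1}{nRL} \sum_{c=1}^{C} T_{.c..}^2 + \frac{T_{....}^2}{N} = 496{,}885 - 473{,}119 - 489{,}082 + 465{,}452 = 136 \]

R × L Interaction
\[ \frac{1}{nC} \sum_{r=1}^{R} \sum_{l=1}^{L} T_{r.l.}^2 - \frac{1}{nLC} \sum_{r=1}^{R} T_{r...}^2 - \frac{1}{nRC} \sum_{l=1}^{L} T_{..l.}^2 + \frac{T_{....}^2}{N} = 482{,}858 - 473{,}119 - 475{,}182 + 465{,}452 = 9 \]

C × L Interaction
\[ \frac{1}{nR} \sum_{c=1}^{C} \sum_{l=1}^{L} T_{.cl.}^2 - \frac{1}{nRL} \sum_{c=1}^{C} T_{.c..}^2 - \frac{1}{nRC} \sum_{l=1}^{L} T_{..l.}^2 + \frac{T_{....}^2}{N} = 499{,}564 - 489{,}082 - 475{,}182 + 465{,}452 = 752 \]

R × C × L Interaction
\[
\begin{aligned}
&\frac{1}{n} \sum_{r} \sum_{c} \sum_{l} T_{rcl.}^2 - \frac{1}{nL} \sum_{r} \sum_{c} T_{rc..}^2 - \frac{1}{nC} \sum_{r} \sum_{l} T_{r.l.}^2 - \frac{1}{nR} \sum_{c} \sum_{l} T_{.cl.}^2 \\
&\quad + \frac{1}{nLC} \sum_{r} T_{r...}^2 + \frac{1}{nRL} \sum_{c} T_{.c..}^2 + \frac{1}{nRC} \sum_{l} T_{..l.}^2 - \frac{T_{....}^2}{N} \\
&= 507{,}600 - 496{,}885 - 482{,}858 - 499{,}564 + 473{,}119 + 489{,}082 + 475{,}182 - 465{,}452 = 224
\end{aligned}
\]

Within cells
\[ \sum_{r} \sum_{c} \sum_{l} \sum_{i} X_{rcli}^2 - \frac{1}{n} \sum_{r} \sum_{c} \sum_{l} T_{rcl.}^2 = 536{,}369 - 507{,}600 = 28{,}769 \]

Total
\[ \sum_{r} \sum_{c} \sum_{l} \sum_{i} X_{rcli}^2 - \frac{T_{....}^2}{N} = 536{,}369 - 465{,}452 = 70{,}917 \]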
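These computations can be reproduced from the twelve cell totals and the sum of squared scores. The following is a minimal sketch: the helper name `q`, the dictionary layout, and the factor labels are illustrative assumptions; the cell totals are those entering the marginal totals of this example, and the sum of squared scores, 536,369, is taken as reported. The helper collapses the cell totals onto any chosen set of factors, squares and sums the resulting totals, and divides by the number of observations in each total.

```python
n = 6                      # animals per cell
sum_sq_scores = 536_369    # sum of squared scores, as reported in the text

# Cell totals keyed by (environment, strain, time of blinding).
cell_totals = {
    ("free", "bright", "early"): 221,
    ("free", "mixed", "early"): 317,
    ("free", "dull", "early"): 508,
    ("restricted", "bright", "early"): 332,
    ("restricted", "mixed", "early"): 465,
    ("restricted", "dull", "early"): 633,
    ("free", "bright", "late"): 341,
    ("free", "mixed", "late"): 514,
    ("free", "dull", "late"): 622,
    ("restricted", "bright", "late"): 502,
    ("restricted", "mixed", "late"): 636,
    ("restricted", "dull", "late"): 698,
}


def q(*keep):
    """Sum of squared totals over the kept factors, divided by scores per total.

    keep holds factor positions: 0 = rows, 1 = columns, 2 = layers.
    q() with no arguments gives the correction term, T squared over N.
    """
    totals = {}
    for key, total in cell_totals.items():
        margin = tuple(key[i] for i in keep)
        totals[margin] = totals.get(margin, 0) + total
    scores_per_total = n * len(cell_totals) // len(totals)
    return sum(t * t for t in totals.values()) / scores_per_total


corr = q()
ss = {
    "rows": q(0) - corr,
    "columns": q(1) - corr,
    "layers": q(2) - corr,
    "RxC": q(0, 1) - q(0) - q(1) + corr,
    "RxL": q(0, 2) - q(0) - q(2) + corr,
    "CxL": q(1, 2) - q(1) - q(2) + corr,
    "RxCxL": (q(0, 1, 2) - q(0, 1) - q(0, 2) - q(1, 2)
              + q(0) + q(1) + q(2) - corr),
    "within": sum_sq_scores - q(0, 1, 2),
    "total": sum_sq_scores - corr,
}

for source, value in ss.items():
    print(f"{source:8s} {value:10.1f}")
```

The script carries full precision throughout, so its results may differ from the values in the text by a fraction of a unit, since the text rounds each of the nine quantities to a whole number before subtracting.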


The analysis of variance table for these data is given in Table 17.7. The df for rows is R − 1 = 2 − 1 = 1; for columns, C − 1 = 3 − 1 = 2; for layers, L − 1 = 2 − 1 = 1; for R × C interactions, (R − 1)(C − 1) = (2 − 1)(3 − 1) = 2; for R × L interactions, (R − 1)(L − 1) = (2 − 1)(2 − 1) = 1; for C × L interactions, (C − 1)(L − 1) = (3 − 1)(2 − 1) = 2; for R × C × L interactions, (R − 1)(C − 1)(L − 1) = (2 − 1)(3 − 1)(2 − 1) = 2; and for within cells, RCL(n − 1) = 12 × 5 = 60.


Table 17.7  Analysis of variance for the data of Table 17.6

Source of                  Sum of     Degrees of   Variance
variation                  squares    freedom      estimate

Rows (environments)          7,667        1         7,667 = $s_r^2$
Columns (strain)            23,630        2        11,815 = $s_c^2$
Layers (early vs. late)      9,730        1         9,730 = $s_l^2$
R × C                          136        2            68 = $s_{rc}^2$
R × L                            9        1             9 = $s_{rl}^2$
C × L                          752        2           376 = $s_{cl}^2$
R × C × L                      224        2           112 = $s_{rcl}^2$
Within cells                28,769       60           479 = $s_w^2$
Total                       70,917       71


In this illustrative example all three variables may be viewed as fixed, and $s_w^2$ becomes the appropriate error term for testing all effects. The following F ratios may be calculated:

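$$F_r = \frac{7{,}667}{479} = 16.01 \qquad p < .01$$

$$F_c = \frac{11{,}815}{479} = 24.67 \qquad p < .01$$

$$F_l = \frac{9{,}730}{479} = 20.31 \qquad p < .01$$

$$F_{rc} = \frac{68}{479} = .14 \qquad p > .05$$

$$F_{rl} = \frac{9}{479} = .02 \qquad p > .05$$

$$F_{cl} = \frac{376}{479} = .78 \qquad p > .05$$

$$F_{rcl} = \frac{112}{479} = .23 \qquad p > .05$$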


In this illustrative example the differences between environments, between strains, and between early and late blinded animals are highly significant. None of the interaction effects is significant. In fact all interaction mean squares are somewhat smaller than would ordinarily be expected on a chance basis.
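
The arithmetic of this example can be checked with a short script. What follows is only a sketch: the dictionary keys and the function name `three_way_ss` are labels chosen here for illustration, and the numerical terms are copied from the worked computation above. Because the script divides by the unrounded within-cells mean square (about 479.5) rather than 479, its F ratios may differ from the printed ones in the second decimal place.

```python
# Uncorrected terms copied from the worked example (hypothetical key names).
terms = {
    "R": 473_119, "C": 489_082, "L": 475_182,        # (1/nCL)sum T_r..^2, etc.
    "RC": 496_885, "RL": 482_858, "CL": 499_564,     # two-way marginal terms
    "RCL": 507_600,                                  # (1/n) sum of squared cell totals
    "raw": 536_369,                                  # sum of squared scores
    "correction": 465_452,                           # T^2 / N
}


def three_way_ss(t):
    """Return the sums of squares for a three-way classification."""
    cf = t["correction"]
    return {
        "rows": t["R"] - cf,
        "columns": t["C"] - cf,
        "layers": t["L"] - cf,
        "RxC": t["RC"] - t["R"] - t["C"] + cf,
        "RxL": t["RL"] - t["R"] - t["L"] + cf,
        "CxL": t["CL"] - t["C"] - t["L"] + cf,
        "RxCxL": (t["RCL"] - t["RC"] - t["RL"] - t["CL"]
                  + t["R"] + t["C"] + t["L"] - cf),
        "within": t["raw"] - t["RCL"],
        "total": t["raw"] - cf,
    }


ss = three_way_ss(terms)
df = {"rows": 1, "columns": 2, "layers": 1, "RxC": 2,
      "RxL": 1, "CxL": 2, "RxCxL": 2, "within": 60}
ms = {name: ss[name] / df[name] for name in df}
# Fixed model: every effect is tested against the within-cells mean square.
f_ratios = {name: ms[name] / ms["within"] for name in df if name != "within"}

for name, f in f_ratios.items():
    print(f"{name:8s} SS = {ss[name]:6d}  df = {df[name]}  F = {f:6.2f}")
```

The component sums of squares add up to the total, which is a useful check on any hand computation of this kind.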


17.10 UNEQUAL NUMBERS IN THE SUBCLASSES


Difficulties associated with unequal numbers in the subclasses discussed in Sec. 16.11 apply also in the three-way classification case. For small departures from equality the method of expected equal frequencies as discussed in Sec. 16.11 may be used. The least-squares solution to the problem of unequal n's in the subclasses is described by Bancroft (1968). In general it is advisable to avoid unequal n's wherever possible.


17.11 TWO-FACTOR EXPERIMENTS WITH
REPEATED MEASUREMENTS


In Secs. 16.12 and 16.13 one-factor experiments with repeated measurements were considered. On occasion experiments are encountered that involve two factors with repeated measurements. Given R levels of one treatment and C levels of another, each subject may be tested under each of the RC treatments. If R = 2 and C = 2, the levels of R being $R_1$ and $R_2$ and of C being $C_1$ and $C_2$, there are four treatment combinations, $R_1C_1$, $R_1C_2$, $R_2C_1$, and $R_2C_2$. Each of N subjects might receive all of the four treatments, the presentations being possibly, although not necessarily, arranged in random order for each subject.

Such data constitute an RCN block of numbers. Rows and columns are treatments, and layers are experimental subjects. These data are analyzed as in the triple-classification case with one observation in each cell. Use the computation formulas given in Sec. 17.8, writing n = 1. Seven sums of squares result: row, column, subjects, R × C, R × S, C × S, and R × C × S. There is, of course, no within-cells sum of squares.

The model here is a mixed model with n = 1. Rows and columns will ordinarily be fixed variables. Layers, or subjects, is a random variable. For this model the expectations of the mean squares and the degrees of freedom are as follows:


Mean square               Expectation of mean square                               df

Rows, $s_r^2$             $\sigma_e^2 + C\sigma_{rs}^2 + NC\sigma_r^2$             R − 1
Columns, $s_c^2$          $\sigma_e^2 + R\sigma_{cs}^2 + NR\sigma_c^2$             C − 1
Subjects, $s_s^2$         $\sigma_e^2 + RC\sigma_s^2$                              N − 1
R × C, $s_{rc}^2$         $\sigma_e^2 + \sigma_{rcs}^2 + N\sigma_{rc}^2$           (R − 1)(C − 1)
R × S, $s_{rs}^2$         $\sigma_e^2 + C\sigma_{rs}^2$                            (R − 1)(N − 1)
C × S, $s_{cs}^2$         $\sigma_e^2 + R\sigma_{cs}^2$                            (C − 1)(N − 1)
R × C × S, $s_{rcs}^2$    $\sigma_e^2 + \sigma_{rcs}^2$                            (R − 1)(C − 1)(N − 1)



Inspection of these expectations indicates that the appropriate error term for testing row effects is the R × S mean square, $F = s_r^2/s_{rs}^2$. The appropriate error term for testing column effects is the C × S mean square, $F = s_c^2/s_{cs}^2$. The appropriate error term for testing the R × C interaction is the R × C × S mean square, $F = s_{rc}^2/s_{rcs}^2$. Unless the R × S, C × S, and R × C × S interactions are zero, which with most sets of data will not be the case, no unbiased tests of difference between subjects, or R × S or C × S interactions, can be made. These are ordinarily not of interest.
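
The analysis just described can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a definitive routine: the function name `repeated_measures_2f`, the nested-list layout `x[r][c][s]`, and the scores themselves are all invented here. The code applies the three-way computation formulas with n = 1 and forms each F ratio with the error term named above.

```python
from itertools import product


def repeated_measures_2f(x):
    """Two-factor repeated-measurements analysis.

    x[r][c][s] is the score of subject s under row level r and column level c.
    Returns (sums of squares, degrees of freedom, F ratios).
    """
    R, C, N = len(x), len(x[0]), len(x[0][0])
    cells = list(product(range(R), range(C), range(N)))
    total = sum(x[r][c][s] for r, c, s in cells)
    cf = total ** 2 / (R * C * N)          # correction term T^2 / (RCN)

    def term(keep):
        # Squared marginal totals over the kept indices, each divided by
        # the number of scores that make up one marginal total.
        sums = {}
        for r, c, s in cells:
            key = tuple(v for v, k in zip((r, c, s), keep) if k)
            sums[key] = sums.get(key, 0.0) + x[r][c][s]
        scores_per_total = R * C * N / len(sums)
        return sum(t * t for t in sums.values()) / scores_per_total

    tr, tc, ts = term((1, 0, 0)), term((0, 1, 0)), term((0, 0, 1))
    trc, trs, tcs = term((1, 1, 0)), term((1, 0, 1)), term((0, 1, 1))
    raw = term((1, 1, 1))                  # sum of squared scores

    ss = {
        "R": tr - cf, "C": tc - cf, "S": ts - cf,
        "RxC": trc - tr - tc + cf,
        "RxS": trs - tr - ts + cf,
        "CxS": tcs - tc - ts + cf,
        "RxCxS": raw - trc - trs - tcs + tr + tc + ts - cf,
    }
    df = {
        "R": R - 1, "C": C - 1, "S": N - 1,
        "RxC": (R - 1) * (C - 1),
        "RxS": (R - 1) * (N - 1),
        "CxS": (C - 1) * (N - 1),
        "RxCxS": (R - 1) * (C - 1) * (N - 1),
    }
    ms = {k: ss[k] / df[k] for k in ss}
    f = {
        "R": ms["R"] / ms["RxS"],          # rows tested against R x S
        "C": ms["C"] / ms["CxS"],          # columns tested against C x S
        "RxC": ms["RxC"] / ms["RxCxS"],    # R x C tested against R x C x S
    }
    return ss, df, f


# Hypothetical scores: 2 row levels, 2 column levels, 3 subjects.
scores = [[[3, 5, 4], [6, 7, 9]],
          [[4, 6, 5], [9, 8, 12]]]
ss, df, f = repeated_measures_2f(scores)
print(ss)
print(f)
```

No F ratio is formed for subjects or for the interactions involving subjects, in keeping with the remark that no unbiased test of these effects is ordinarily available.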


EXERCISES 1 In a 5 × 4 × 3 factorial experiment with 10 observations per cell,
calculate the number of degrees of freedom associated with (a) the
main effects, (b) interaction sums of squares, and (c) within-cells sums
of squares.


2 Given the following cell means with five observations per cell:


Calculate the R × C, R × L, C × L, and R × C × L interactions.


3 In a three-way analysis of variance what is the correct error term for
testing (a) the first-order interaction mean squares when all three variables
are random, (b) the column sum of squares when rows are
random and columns and layers are fixed, (c) the second-order interaction
when all three variables are fixed.


4 The following are data for a 2 × 2 × 2 factorial experiment with six
observations in each of the eight experimental treatments.

[Data table not legible in the source.]

Calculate (a) the sums of squares and (b) the mean squares for these
data. On the assumption that the three variables are fixed, (c) test the
significance of the main effects and interaction effects.


5 The following are data for a 3 × 3 × 2 factorial experiment with nine
observations in each of the 18 combinations of experimental treatments.


25 22 19 16 10 31 9: 185 
16 31 12 12 19 30 10 15 12 


Calculate (a) the sums of squares and (b) the mean squares for these
data. On the assumption that the three variables are fixed, (c) test the
significance of the main effects and interaction effects.


6 The following are data for a repeated-measurements experiment in
which four experimental subjects are tested under four treatment
combinations.

[Data table for Subjects 1 to 4 not legible in the source.]

Apply an analysis of variance to these data, and test the significance of
row and column effects.




MULTIPLE COMPARISONS


18.1 INTRODUCTION


On comparing k means the analysis of variance may lead to a significant F
test. A meaningful interpretation of the data may require a comparison of
pairs of means. The differences between some pairs may be significant,
while those between others may not be. The investigator may wish to
compare selected pairs of means, [...] other mean, there being k(k − 1)/2
such comparisons, [...] common use, using either the F test or a statistic
known as the "studentized range," [...] per-comparison error rate and
experimentwise error rate. The per-comparison error [...]



A distinction is commonly made between a priori comparisons and a posteriori or postmortem comparisons. A priori comparisons are formulated prior to, and quite apart from, an inspection of the data. For example, given an experiment involving the comparison of four means, the investigator may decide in advance to test $\bar{X}_1 - \bar{X}_2$. A posteriori comparisons are formulated after, and may be suggested by, inspection of the data. From a logical viewpoint a priori comparisons may be applied whether or not the F test has led to the rejection of the null hypothesis. A posteriori tests are applied only following a significant F test. From a practical point of view the utility of the distinction between a priori and a posteriori comparisons may be questioned. Ordinarily in the analysis of most sets of experimental data no way exists for ascertaining whether certain comparisons were formulated on an a priori basis or not.

When more than one comparison is made, a distinction is drawn between orthogonal and nonorthogonal comparisons. Orthogonal comparisons are independent of each other. It is convenient to conceptualize a comparison as a weighted sum of means. Thus for a set of four means the difference $\bar{X}_1 - \bar{X}_2$ is a weighted sum of $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$, and $\bar{X}_4$, the weights being 1, −1, 0, and 0. Also the difference $\bar{X}_3 - \bar{X}_4$ is a weighted sum, the weights being 0, 0, 1, and −1. In general, for equal n's two comparisons are said to be orthogonal when the sum of products of the paired weights is zero. The comparisons $\bar{X}_1 - \bar{X}_2$ and $\bar{X}_3 - \bar{X}_4$ are orthogonal because (1)(0) + (−1)(0) + (0)(1) + (0)(−1) = 0. If the sum of products of weights is not zero, the comparisons are not orthogonal. They are not independent of each other. For further discussion of orthogonality see Chap. 19.
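
The orthogonality rule for equal n's is easy to check mechanically. The following is a minimal sketch; the function name `is_orthogonal` is an invention for this illustration.

```python
def is_orthogonal(weights_a, weights_b):
    """True when the sum of products of the paired weights is zero (equal n's)."""
    if len(weights_a) != len(weights_b):
        raise ValueError("both comparisons must weight the same set of means")
    return sum(a * b for a, b in zip(weights_a, weights_b)) == 0


# Comparison of means 1 and 2 against comparison of means 3 and 4.
print(is_orthogonal([1, -1, 0, 0], [0, 0, 1, -1]))
# Comparison of means 1 and 2 against comparison of means 1 and 3.
print(is_orthogonal([1, -1, 0, 0], [1, 0, -1, 0]))
```

The second pair shares the first mean, so its sum of products is 1 and the two comparisons are not independent.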


18.2 MULTIPLE COMPARISONS USING THE t TEST


For orthogonal a priori comparisons between pairs of means we apply a simple t test to the differences between pairs of means, using the within-group variance estimate, $s_w^2$. This estimate is based on a larger number of degrees of freedom than a variance estimate obtained from two groups only. The required value of t is given by:

[18.1]
$$t = \frac{\bar{X}_i - \bar{X}_j}{\sqrt{s_w^2/n_i + s_w^2/n_j}}$$

To illustrate, from the data of Table 15.2 the means for samples I and II are 5.38 and 8.40, respectively. The within-group variance, based on 22 degrees of freedom, is 3.91. The numbers in the two groups are 8 and 5. The t test is then

$$t = \frac{5.38 - 8.40}{\sqrt{3.91/8 + 3.91/5}} = -2.68$$

We consult a table of t for df = 22. The values required for significance at the .05 and .01 levels are 2.07 and 2.82, respectively. The absolute value of the observed t falls between these two values, so the difference is significant at the .05 level but not at the .01 level. Because of its restriction to a priori orthogonal comparisons, the t test method has limited application.
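
The t ratio of this example can be reproduced as follows. This is a small sketch only; the function name `pairwise_t` and its parameter names are chosen here for illustration, and the numbers are those quoted above from Table 15.2.

```python
from math import sqrt


def pairwise_t(mean_i, mean_j, n_i, n_j, s2_within):
    """t for the difference between two means, using the within-group variance."""
    return (mean_i - mean_j) / sqrt(s2_within / n_i + s2_within / n_j)


# Samples I and II: means 5.38 and 8.40, n's 8 and 5, within-group variance 3.91.
t = pairwise_t(5.38, 8.40, 8, 5, 3.91)
print(round(t, 2))

# Critical values for df = 22 quoted in the text (two-tailed).
t_05, t_01 = 2.07, 2.82
print(abs(t) > t_05, abs(t) > t_01)
```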


18.3 MULTIPLE COMPARISONS USING THE F TEST


The method described here is due to Scheffé. This method uses the criterion that the probability of rejecting the null hypothesis when it is true, a [...] value of F required for significance at the .05 or .01, or any desired level, for $df_1 = k - 1$ and $df_2 = N - k$. Third, calculate a quantity F′, which is k − 1 times the F required for significance at the desired significance level; that is, F′ = (k − 1)F. Fourth, compare the values of F and F′. [...]


Comparison      F

I, II          7.18
I, III          .55
I, IV          4.97
II, III        3.81
II, IV        20.34
III, IV        8.18


[...] groups combined, $\bar{X}_{1+2} = (n_1\bar{X}_1 + n_2\bar{X}_2)/(n_1 + n_2)$, with the mean for the second two groups, $\bar{X}_{3+4} = (n_3\bar{X}_3 + n_4\bar{X}_4)/(n_3 + n_4)$. The F ratio in this case is

[18.2]
$$F = \frac{(\bar{X}_{1+2} - \bar{X}_{3+4})^2}{s_w^2/(n_1 + n_2) + s_w^2/(n_3 + n_4)}$$



For a discussion of the application of the Scheffé method to any or all possible comparisons see Edwards (1968).

The Scheffé method is more rigorous than other multiple comparison methods with regard to Type I error. It will lead to fewer significant differences. It is easy to apply. No special problems arise because of unequal n's. It uses the readily available F test. The criterion it employs in the evaluation of the null hypothesis is simple and readily understood. It is not seriously affected by violations of the assumptions of normality and homogeneity of variance, unless these are gross. It can be used for making any comparison the investigator wishes to make.

Concern may attach to the fact that the Scheffé procedure is more rigorous than other procedures, and will lead to fewer significant results. Because this is so, the investigator may choose to employ a less rigorous significance level in using the Scheffé procedure; that is, the .10 level may be used instead of the .05 level. This is Scheffé's recommendation (1959). Readily available tables of F do not ordinarily contain critical values at the .10 level. Tables showing the .10 critical values are given in Fisher and Yates (1963), and also in Winer (1962).
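
A sketch of the pairwise procedure follows. Two things in it are assumptions made for this illustration. First, the F ratio for a pair of means is taken to be the square of the t ratio of formula [18.1]; this reproduces the tabled F values for the data of Table 15.2 to within rounding. Second, the critical value 3.05 is the .05 point of F for 3 and 22 degrees of freedom as read from a standard F table, and it is passed in as an argument rather than computed. The function name `scheffe_pairs` is hypothetical.

```python
from itertools import combinations


def scheffe_pairs(means, ns, s2_within, f_crit):
    """Return F' = (k - 1) * F_crit and, for each pair, (F, significant?)."""
    k = len(means)
    f_prime = (k - 1) * f_crit
    results = {}
    for a, b in combinations(means, 2):
        f = (means[a] - means[b]) ** 2 / (s2_within / ns[a] + s2_within / ns[b])
        results[(a, b)] = (f, f >= f_prime)
    return f_prime, results


# Data of Table 15.2 as quoted in this chapter.
means = {"I": 5.38, "II": 8.40, "III": 6.14, "IV": 3.00}
ns = {"I": 8, "II": 5, "III": 7, "IV": 6}
f_prime, results = scheffe_pairs(means, ns, s2_within=3.91, f_crit=3.05)

print(f"F' = {f_prime:.2f}")
for pair, (f, significant) in results.items():
    print(pair, round(f, 2), "significant" if significant else "not significant")
```

With these inputs only the comparison of II with IV exceeds F′, which illustrates how conservative the procedure is.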




18.4 MULTIPLE COMPARISONS USING THE STUDENTIZED RANGE


A number of methods for making multiple comparisons use a statistic known as the studentized range. Given k treatment means based on equal n's, the studentized range is simply the difference between the largest and the smallest treatment means divided by an estimate of the standard error associated with a single treatment mean; that is,

[18.3]
$$Q = \frac{\bar{X}_{max} - \bar{X}_{min}}{s_{\bar{x}}} = \frac{\bar{X}_{max} - \bar{X}_{min}}{\sqrt{s_w^2/n}}$$

For different values of k, and different df associated with $s_w^2$, the sampling distribution of Q is known. Table L shows the 95 and 99 percentile points for the distribution of Q for different k and df. These are the values required for the rejection of the null hypothesis at the .05 and .01 levels. The null hypothesis in this case is $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$, or that the samples came from a single population with mean $\mu$. The Q test could quite clearly be used as an alternate to the usual F test.

Multiple-comparison methods based on the studentized range compare the observed studentized ranges for pairs of means with determined criterion values of the studentized range. If the observed value of Q exceeds the criterion value of Q, the difference is said to be significant. Some of these methods are sequential. They involve ordering the means from low to high. The size of the difference required for significance changes with the separation of the means in the rank order. Other methods use a fixed interval; that is, a fixed criterion value of Q for all comparisons is used, regardless of the separation of the means in the rank order. All multiple-comparison methods using the studentized range apply only to equal n's. This detracts from their usefulness.

One commonly used multiple-comparison method using the studentized range is the Newman-Keuls method. This method uses the criterion that the probability of rejecting the null hypothesis when it is true should not exceed .01 or .05 for all ordered pairs, regardless of the number of steps they are apart. To apply this method the means are ranked from low to high. The studentized ranges are obtained for all k(k − 1)/2 pairs of means, using $s_{\bar{x}} = \sqrt{s_w^2/n}$. Criterion values of Q at the .01 or .05 levels for k = 2, 3, . . . , k, are obtained from Table L for the df associated with $s_w^2$. Denote these criterion values as $Q_2$, $Q_3$, . . . , $Q_k$. $Q_2$ is the criterion value for comparing two adjacent means, $Q_3$ for means separated by one intervening rank, $Q_4$ for means separated by two intervening ranks, and $Q_k$ for means separated by k − 2 intervening ranks. Comparisons of observed Q's with criterion values are made in a particular sequential manner. Values in the first row of the table of Q are compared with the criterion values, proceeding from right to left until a nonsignificant difference is found. No further comparisons are made in that row. This procedure is then applied to the second row, third row, and so on.

To illustrate, consider the following means for five groups of equal size, n = 6.

                 I       II      III     IV      V

n                6       6       6       6       6
$\bar{X}_j$    12.56    9.54    3.00    6.45    8.32


The analysis of variance table for these data is:

Source      Sum of squares    df    Variance estimate

Between         304.02         4     76.01 = $s_b^2$
Within          262.50        25     10.50 = $s_w^2$
Total           566.52        29


The overall F here is F = 76.01/10.50 = 7.24, which is significant at better than the .01 level. The means are arranged in rank order from $\bar{X}_3$ = 3.00 to $\bar{X}_1$ = 12.56. The order of the means is $\bar{X}_3$, $\bar{X}_4$, $\bar{X}_5$, $\bar{X}_2$, $\bar{X}_1$. The difference between every mean and every other mean is calculated. These differences are divided by $s_{\bar{x}} = \sqrt{s_w^2/n} = \sqrt{10.50/6} = 1.323$ to obtain the studentized ranges Q, as follows:


Table of Q

        III     IV      V       II      I

III             2.61    4.02    4.92    7.22
IV                      1.41    2.33    4.61
V                                .92    3.20
II                                      2.28
I


If the .05 significance level is adopted, criterion values of Q for a k of 2, 3, 4, and 5 and df = 25 are obtained from Table L. These are $Q_2$ = 2.91, $Q_3$ = 3.52, $Q_4$ = 3.89, and $Q_5$ = 4.16. First, we compare the top row of the above table of Q with the criterion values. The largest value, 7.22, involves the comparison of III with I. The criterion here is $Q_5$ = 4.16. The observed value exceeds the criterion value and is declared significant. The next largest value in this row is 4.92, which is compared with $Q_4$ = 3.89 and is found to be significant. The next largest is 4.02, which is compared with $Q_3$ = 3.52 and is significant. The next value is 2.61, which when compared with $Q_2$ = 2.91 is not significant. We now proceed to the second row of the Q table, and compare 4.61 with $Q_4$ = 3.89, which is significant. The next value, 2.33, is not significant and comparisons for that row are discontinued. Proceeding similarly, no other values in the table attain significance. Thus, in this example, the differences between III and I, between III and II, between III and V, and between IV and I are significant at the .05 level.

The reader should note that the above procedure requires that comparisons be discontinued for any row at the first nonsignificant Q value. This rule prevents the making of an inconsistent decision, that is, declaring that a larger difference is not significant whereas a smaller difference adjacent to it is significant. This circumstance could arise on occasion with the Newman-Keuls method if all Q values were compared directly with the criterion values.
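
The sequential rule can be expressed compactly in code. The sketch below is an illustration, not a reference implementation: the function name `newman_keuls` is invented here, the criterion values are those quoted above for df = 25, and the mean of group V is taken as 8.32, the value implied by the tabled studentized ranges. Because the script works from unrounded quantities, a computed Q may differ from a tabled one in the second decimal place.

```python
from math import sqrt


def newman_keuls(means, n, s2_within, q_crit):
    """Return the pairs declared significant by the row-by-row stopping rule.

    means  : dict mapping group label to mean (equal n's assumed)
    q_crit : dict mapping number of steps (2, 3, ..., k) to the criterion Q
    """
    s_mean = sqrt(s2_within / n)               # standard error of a single mean
    order = sorted(means, key=means.get)       # labels ranked from low to high
    significant = []
    for i, low in enumerate(order[:-1]):       # one row of the Q table per mean
        for j in range(len(order) - 1, i, -1): # right to left within the row
            high = order[j]
            q = (means[high] - means[low]) / s_mean
            steps = j - i + 1
            if q <= q_crit[steps]:
                break                          # stop at the first nonsignificant Q
            significant.append((low, high))
    return significant


means = {"I": 12.56, "II": 9.54, "III": 3.00, "IV": 6.45, "V": 8.32}
q_05 = {2: 2.91, 3: 3.52, 4: 3.89, 5: 4.16}
pairs = newman_keuls(means, n=6, s2_within=10.50, q_crit=q_05)
print(pairs)
```

Breaking out of the inner loop is what enforces the rule that comparisons in a row are discontinued at the first nonsignificant value.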

The Duncan method, sometimes called the Duncan new multiple-range test, is a sequential test which is similar to the Newman-Keuls test. It differs, however, in the level of significance used. The Newman-Keuls test uses a significance level of .05 or .01 for every separation of 2, 3, . . . , k steps. The Duncan method uses a significance level $1 - (1 - \alpha)^{k-1}$, where $\alpha$ is ordinarily .05 or .01. Thus for $\alpha$ = .01 for two adjacent means, k = 2 and the significance level is 1 − .99 = .01. For means separated by an intervening mean, k = 3 and the significance level is $1 - (.99)^2$ = .02. For means separated by two intervening means, k = 4 and the significance level is $1 - (.99)^3$ = .03. Thus the farther the means are apart in the rank ordering, the more lenient the standard of significance. The Duncan method requires a smaller studentized range for significance than does the Newman-Keuls method, except when k = 2. The method will yield more significant results than the Newman-Keuls method. The Duncan method requires the use of special tables. These tables are given by Edwards (1968), who provides a detailed description of this test.
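
The significance levels used by the Duncan method follow directly from the expression $1 - (1 - \alpha)^{k-1}$. A brief sketch, with a function name chosen here for illustration:

```python
def duncan_level(alpha, k):
    """Significance level for means k steps apart: 1 - (1 - alpha)**(k - 1)."""
    return 1 - (1 - alpha) ** (k - 1)


levels = {k: duncan_level(0.01, k) for k in (2, 3, 4, 5)}
for k, level in levels.items():
    print(k, round(level, 4))
```

The levels grow with k, which is the sense in which the standard becomes more lenient as the means lie farther apart in the rank order.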

Another multiple-comparison method is due to Tukey and has been called the honestly significant difference method. It uses a single criterion value of the studentized range, regardless of the separation of means in the rank order. It is a nonsequential method. The criterion value is $Q_k$, the value required for significance at the .05 or .01 levels for k and df, where k is the total number of means in the set and, as before, df is the number of degrees of freedom associated with $s_w^2$. This method will lead to fewer significant differences than either the Newman-Keuls or the Duncan method.

All multiple-comparison procedures that use the studentized range are limited in their application because they apply only to groups composed of equal n's. Bancroft (1968) has suggested an intuitive modification for dealing with unequal sample size. His proposal is to replace the n used in $s_{\bar{x}} = \sqrt{s_w^2/n}$ by $\tilde{n}_h$, that is, $s_{\bar{x}} = \sqrt{s_w^2/\tilde{n}_h}$, where $\tilde{n}_h$ is the harmonic mean of the number of observations in the groups, that is:

[18.4]
$$\tilde{n}_h = \frac{k}{(1/n_1) + (1/n_2) + \cdots + (1/n_k)}$$

Clearly this method should not be used where the n's differ appreciably from each other.
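
The harmonic-mean substitution is a one-line computation. The sketch below uses the group sizes and within-group variance quoted earlier for the data of Table 15.2 purely as convenient numbers; the function name is hypothetical, and the example is not meant to suggest that the modification is appropriate for those data.

```python
from math import sqrt
from statistics import harmonic_mean


def standard_error_unequal_n(s2_within, ns):
    """Return (harmonic mean of the n's, standard error of a single mean)."""
    n_h = harmonic_mean(ns)        # k / (1/n_1 + 1/n_2 + ... + 1/n_k)
    return n_h, sqrt(s2_within / n_h)


n_h, s_mean = standard_error_unequal_n(3.91, [8, 5, 7, 6])
print(round(n_h, 3), round(s_mean, 3))
```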


18.5 CONCLUDING OBSERVATIONS


The problem of choosing a particular multiple-comparison procedure to apply to a particular set of experimental data has never been properly resolved. This choice is made by each investigator in the light of considerations which appear relevant to him. One consideration here is clearly the relative importance which attaches to Type I and Type II errors. The [...] of the procedure is the reverse: Duncan, Newman-Keuls, Tukey, and Scheffé. Many investigators may wish to compromise between Type I and Type II errors and may choose to use the Scheffé procedure with a significance level greater than .05, say .10, or the Newman-Keuls procedure.


EXERCISES



1 The following are means for five groups of experimental subjects:


                 I       II      III     IV      V

n                5       5       5       5       5
$\bar{X}_j$     8.60    8.40   20.00   29.20   18.40


The within-group mean square $s_w^2$ = 16.32, and the F ratio is 23.31.
Test the significance of the difference between means for groups [...]
and V, using a t test procedure.

2 Compare means two at a time for the data of Exercise 1 above, using
Scheffé's procedure. Indicate which comparisons are significant at the
.01 level.

3 Apply the Newman-Keuls procedure to the data of Exercise 1 and
indicate which comparisons are significant at the .01 level.

4 Apply Scheffé's procedure to the data of Exercise 3, Chap. [...].
Indicate which comparisons are significant at the .01 level.


TREND ANALYSIS


19.1 INTRODUCTION


In experiments where the treatment, or independent, variable is nominal, the analysis of the data cannot be extended beyond an F test applied to the group means and the comparison of means either two at a time or in subgroups. In some experiments the treatment variable is of the interval or ratio type. Examples are experiments on the behavioral effects of different dosages of a drug, different periods of practice in learning a task, or different numbers of reinforcements in conditioning or extinction. With such [...] his analysis to an examination of the relation between the treatment [...] this chapter provides answers to questions of this kind. It is an application of the analysis of variance.

[...] able, although such an equation could readily be obtained. Usually concern is with questions of significance.

The procedures described in this chapter assume that the levels of the treatment variable are equally spaced. Equal spacing is not a necessary condition for the application of methods of trend analysis. It leads, however, to simplification in computation.


19.2 LINEAR TREND: PARTITIONING THE SUM OF SQUARES 


Consider an experiment in which k equally spaced treatments are administered to k groups, composed of $n_1$, $n_2$, . . . , $n_k$ members. In applying an analysis of variance for one-way classification to such data, the total sum of squares is partitioned into two parts, a within-groups and a between-groups sum of squares. These are used in an F test of the hypothesis, $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$. The group means $\bar{X}_1$, $\bar{X}_2$, . . . , $\bar{X}_k$ may, however, show a systematic tendency to either increase or decrease in a linear fashion with increase or decrease in the treatment variable. Consequently, the investigator may choose to concern himself with questions of linear trend.

A linear regression line may be fitted to the group means. This line may be used to predict the group means from values of the treatment variable. The mean for group j is $\bar{X}_j$. Denote the corresponding value predicted from the regression line by $X'_j$. The total sum of squares may now be partitioned into three independent parts. We begin by writing an identity

[19.1]
$$(X_{ij} - \bar{X}) = (X_{ij} - \bar{X}_j) + (\bar{X}_j - X'_j) + (X'_j - \bar{X})$$

This identity states that the deviation of a particular score $X_{ij}$ from the grand mean $\bar{X}$ may be viewed as composed of three parts: (1) a deviation of the score $X_{ij}$ from the mean of the group $\bar{X}_j$ to which it belongs; (2) a deviation of $\bar{X}_j$ from the value predicted from the linear regression line; and (3) a deviation of the predicted value $X'_j$ from the grand mean. This identity is squared, summed over the $n_j$ cases in the jth group, and summed over the k groups. The cross-product terms vanish, and we obtain

[19.2]
$$\sum_{j=1}^{k}\sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2 = \sum_{j=1}^{k}\sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2 + \sum_{j=1}^{k} n_j(\bar{X}_j - X'_j)^2 + \sum_{j=1}^{k} n_j(X'_j - \bar{X})^2$$

Thus the total sum of squares is partitioned into three parts. The first part is a within-groups sum of squares, and has associated with it N − k degrees of freedom. The second part is a sum of squares of deviations of array means from linear regression. If the group means fall exactly on a straight line, this sum of squares is zero. It has associated with it k − 2 degrees of freedom. The third part is a sum of squares of deviations of the predicted values, $X'_j$, from the grand mean. It has associated with it one degree of freedom. This sum of squares is closely related to the slope $b_{xy}$ of the regression line. It may be shown that

[19.3]
$$\sum_{j=1}^{k} n_j(X'_j - \bar{X})^2 = N b_{xy}^2 s_y^2$$

where $s_y^2$ is the variance of the treatment variable. For a particular experiment, the quantity $N s_y^2$ exhibits no freedom of variation. It may be viewed as fixed. Variation in the sum of squares depends, therefore, on variation in one quantity only, $b_{xy}$, the slope of the line. Consequently, the sum of squares has associated with it one degree of freedom.

The three sums of squares are divided by their associated degrees of freedom. Three variance estimates, or mean squares, are obtained. These are $s_w^2$, the within-groups mean square; $s_d^2$, the mean square for deviations from the linear regression line; and $s_l^2$, the linear regression mean square. Two F ratios may be obtained. The first is $F_l = s_l^2/s_w^2$, with one degree of freedom associated with the numerator and N − k degrees of freedom associated with the denominator. This provides a test of linear trend. It tests whether the slope of the regression line is significantly different from zero. The second F ratio is $F_d = s_d^2/s_w^2$, with k − 2 degrees of freedom associated with the numerator and N − k with the denominator. This provides a test of departures from linearity. It tests whether the variance of deviations from linearity is significantly different from zero. If $F_d$ is significant, we may conclude that a straight regression line is not a good fit to the set of group means.


19.3 COMPUTATION FORMULAS FOR LINEAR TREND


The calculation involved in the analysis of data for trend with equally spaced treatment levels may be simplified by using a computation variable. In the process of calculation the values of the computation variable are assigned to different levels of the treatment variable. For k = 3 to k = 7 the computation variables, denoted by $c_j$, are as follows:


k      $c_j$

3      −1    0    1
4      −3   −1    1    3
5      −2   −1    0    1    2
6      −5   −3   −1    1    3    5
7      −3   −2   −1    0    1    2    3


[...] calculating a mean or standard deviation.

The computation formulas for the total and within-groups sums of squares are the same as for one-way classification, and are given in Chap. 15.


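For unequal n's the sum of squares for linear regression is given by

[19.4]
$$\sum_{j=1}^{k} n_j(X'_j - \bar{X})^2 = \frac{\left(\sum_{j=1}^{k} c_jT_j - T\sum_{j=1}^{k} n_jc_j / N\right)^2}{\sum_{j=1}^{k} n_jc_j^2 - \left(\sum_{j=1}^{k} n_jc_j\right)^2 / N}$$

where $T_j = n_j\bar{X}_j$ is the sum of the scores in group j and T is the sum of all N scores. The deviation sum of squares may be obtained by subtracting the within and linear regression sums of squares from the total sum of squares.

For equal n's the sum of squares for linear regression reduces to

[19.5]
$$n\sum_{j=1}^{k} (X'_j - \bar{X})^2 = \frac{\left(\sum_{j=1}^{k} c_jT_j\right)^2}{n\sum_{j=1}^{k} c_j^2}$$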


19.4 ILLUSTRATIVE EXAMPLE OF
ANALYSIS FOR LINEAR TREND


The computation required for an analysis of variance for linear trend is illustrated in Table 19.1. The data are those used previously in Table 15.2. The total sum of squares is 168.15, as before, and the within-groups sum is 85.93. The linear regression sum of squares is

$$\frac{\left(\sum c_jT_j - T\sum n_jc_j / N\right)^2}{\sum n_jc_j^2 - \left(\sum n_jc_j\right)^2 / N} = \frac{[-74 - 146 \times (-4)/26]^2}{138 - 16/26} = 19.34$$


Table 19.1  Computation for linear trend analysis with unequal n's using the data of Table 15.2

                              Method
                     1        2        3        4

$\bar{X}_j$         5.38     8.40     6.14     3.00
$n_j$               8        5        7        6
$T_j$              43       42       43       18
$c_j$              −3       −1        1        3
$\sum X_{ij}^2$   269      364      287       68
$T_j^2/n_j$       231.13   352.80   264.14    54.00

N = 26
T = 146
$T^2/N$ = 819.85
$\sum n_jc_j$ = −4
$\sum c_jT_j$ = −74
$\sum n_jc_j^2$ = 138
$\sum\sum X_{ij}^2$ = 988


Table 19.2  Linear trend analysis for the data of Table 19.1

Source of            Sum of     Degrees of    Variance
variation            squares    freedom       estimate

Linear regression     19.34         1         $s_l^2$ = 19.34
Deviation             62.88         2         $s_d^2$ = 31.44
Within                85.93        22         $s_w^2$ = 3.91
Total                168.15        25

$$F_l = \frac{19.34}{3.91} = 4.95 \qquad F_d = \frac{31.44}{3.91} = 8.04$$


The deviation sum of squares, obtained by subtraction, is 168.15 − 85.93 − 19.34 = 62.88. The F ratio for linear regression is significant at a little better than the .05 level. The F ratio for deviations from linear regression is significant at better than the .01 level. If the .05 level is accepted as a suitable level of significance, the conclusion is that the data show a significant linear trend. Because of the significant deviations from linearity, however, the conclusion is that a straight line is not a good fit to the data. The relation between the treatment and dependent variables may be represented more appropriately by a curved line.
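
The computation for this example can be retraced in code. The sketch below assumes the unequal-n formula in the form used in the worked arithmetic above, with group totals $T_j$, group sizes $n_j$, and the computation variable $c_j$ = −3, −1, 1, 3; the function name `linear_trend_ss` is an invention for this illustration. Because it carries unrounded values, the F ratios may differ from the printed ones in the second decimal place.

```python
def linear_trend_ss(totals, ns, coefs):
    """Sum of squares for linear regression with unequal n's."""
    n_total = sum(ns)
    grand_total = sum(totals)
    sum_ct = sum(c * t for c, t in zip(coefs, totals))      # sum of c_j * T_j
    sum_nc = sum(n * c for n, c in zip(ns, coefs))          # sum of n_j * c_j
    sum_nc2 = sum(n * c * c for n, c in zip(ns, coefs))     # sum of n_j * c_j^2
    numerator = (sum_ct - grand_total * sum_nc / n_total) ** 2
    denominator = sum_nc2 - sum_nc ** 2 / n_total
    return numerator / denominator


# Values from Table 19.1.
ss_linear = linear_trend_ss(totals=[43, 42, 43, 18], ns=[8, 5, 7, 6],
                            coefs=[-3, -1, 1, 3])
ss_total, ss_within = 168.15, 85.93
ss_deviation = ss_total - ss_within - ss_linear

k, n_total = 4, 26
ms_within = ss_within / (n_total - k)
f_linear = ss_linear / ms_within                  # df = 1 and N - k
f_deviation = (ss_deviation / (k - 2)) / ms_within  # df = k - 2 and N - k

print(round(ss_linear, 2), round(ss_deviation, 2))
print(round(f_linear, 2), round(f_deviation, 2))
```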


19.5 POLYNOMIAL REGRESSION


The procedures described thus far deal with [...]

[19.6]
$$X' = a + b_1Y + b_2Y^2 + \cdots + b_mY^m$$

In this equation a is the point where the curve intercepts the X axis, $b_1$, $b_2$, . . . , $b_m$ are regression coefficients or weights, and $Y$, $Y^2$, . . . , $Y^m$ are the powers of Y. Note that the equation for a straight line is a particular instance of this equation which occurs when all terms to the right of $b_1Y$ vanish. A straight line is a polynomial of the first degree. If all terms to the right of $b_2Y^2$ vanish, a second-degree polynomial, or quadratic equation, results. Similarly, higher-order polynomials, cubic equations, quartic equations, and the like may be considered. The method of calculating the coefficients $b_1$, $b_2$, . . . , $b_m$ is quite simple, although the details of it need not concern us here. The coefficients may be obtained by the method of least squares using the multiple regression method described in Chap. 26 of this book. The powers of Y, that is, $Y$, $Y^2$, . . . , $Y^m$, are used as the independent variables, and $b_1$, $b_2$, . . . , $b_m$ are multiple regression weights.


19.6 ORTHOGONAL POLYNOMIALS

Of particular interest in the analysis of data for trend is a particular type of polynomial known as an orthogonal polynomial. In the polynomial equation of formula [19.6] $Y$, $Y^2$, . . . , $Y^m$ are not independent of each other. Consequently it is not possible to partition a sum of squares on X′ into a number of independent components each associated in turn with $Y$, $Y^2$, . . . , $Y^m$. A polynomial equation of the type shown in formula [19.6] can be expressed, or represented, in a form such that the successive terms are, in fact, independent or uncorrelated. Such an equation may be written as

[19.7]
$$X'_j = a + b_1c_{1j} + b_2c_{2j} + \cdots + b_mc_{mj}$$

In this equation the b's are regression coefficients, and the c's are orthogonal sets of coefficients.

To illustrate the nature of the c's in the above equation, consider the two sets of numbers

$c_{1j}$     −1     0     1
$c_{2j}$      1    −2     1

Note that $\sum c_{1j} = \sum c_{2j} = 0$; also $\sum c_{1j}c_{2j} = 0$. Because the sum of products is zero, $c_{1j}$ and $c_{2j}$ are said to be orthogonal. They are independent of each other, or uncorrelated. Again consider the following:

$c_{1j}$     −3    −1     1     3
$c_{2j}$      1    −1    −1     1
$c_{3j}$     −1     3    −3     1

Here $\sum c_{1j} = \sum c_{2j} = \sum c_{3j} = 0$. Also, $\sum c_{1j}c_{2j} = \sum c_{1j}c_{3j} = \sum c_{2j}c_{3j} = 0$. The three sets of numbers are spoken of as coefficients for orthogonal polynomials. The set $c_{1j}$ is the set of linear coefficients; $c_{2j}$, the quadratic; and $c_{3j}$, the cubic. The number of times the signs change determines the degree of the polynomial. For the coefficients above, one sign change occurs in the linear set, two in the quadratic, and three in the cubic.
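
These properties can be verified directly. The sketch below uses the linear, quadratic, and cubic coefficients for four equally spaced levels; the function names are chosen here for illustration.

```python
from itertools import combinations


def sign_changes(coefs):
    """Count how many times the sign changes along a set of coefficients."""
    signs = [c > 0 for c in coefs if c != 0]   # zeros carry no sign
    return sum(a != b for a, b in zip(signs, signs[1:]))


def mutually_orthogonal(sets):
    """True when every set sums to zero and every pair has zero sum of products."""
    sums_zero = all(sum(c) == 0 for c in sets.values())
    products_zero = all(
        sum(a * b for a, b in zip(sets[p], sets[q])) == 0
        for p, q in combinations(sets, 2)
    )
    return sums_zero and products_zero


coefficients = {
    "linear": [-3, -1, 1, 3],
    "quadratic": [1, -1, -1, 1],
    "cubic": [-1, 3, -3, 1],
}
print(mutually_orthogonal(coefficients))
print({name: sign_changes(c) for name, c in coefficients.items()})
```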

A description of the procedure for obtaining sets of orthogonal coefficients is beyond the scope of the present discussion. The procedure is described in Kendall (1943) and elsewhere. Table J of the Appendix shows a short table of coefficients for orthogonal polynomials. A more extended table will be found in Fisher and Yates (1963).


19.7 PARTITIONING A SUM OF SQUARES USING ORTHOGONAL POLYNOMIALS

In the discussion on linear regression in Sec. 19.2 the values of $X_j'$ of
equation [19.1] were obtained by using a linear regression equation. In
polynomial regression the values of $X_j'$ may be obtained by using a
polynomial regression equation as shown in formula [19.6]. With orthogonal
polynomials the values of $X_j'$ are obtained by using an orthogonal
polynomial as shown in formula [19.7]. Because for equal n's $a = \bar{X}$,
we may write

[19.8]  $$X_j' - \bar{X} = b_1 c_{1j} + b_2 c_{2j} + \cdots + b_m c_{mj}$$

By substitution in equation [19.1]

[19.9]  $$X_{ij} - \bar{X} = (X_{ij} - \bar{X}_j) + (\bar{X}_j - X_j') + b_1 c_{1j} + b_2 c_{2j} + \cdots + b_m c_{mj}$$

This equation is squared, summed over the n cases in the jth group, the n's
for the k groups being considered equal, and summed over the k groups. The
cross-product terms vanish, and we obtain

[19.10]  $$\sum_{j=1}^{k}\sum_{i=1}^{n}(X_{ij} - \bar{X})^2 = \sum_{j=1}^{k}\sum_{i=1}^{n}(X_{ij} - \bar{X}_j)^2 + n\sum_{j=1}^{k}(\bar{X}_j - X_j')^2 + n b_1^2\sum_{j=1}^{k} c_{1j}^2 + n b_2^2\sum_{j=1}^{k} c_{2j}^2 + \cdots + n b_m^2\sum_{j=1}^{k} c_{mj}^2$$


... quadratic, cubic, quartic, and so on.

The within-group sum of squares has $k(n - 1)$ degrees of freedom. The
deviation sum of squares has $k - 2$ degrees of freedom for a linear
equation, $k - 3$ for a quadratic equation, ...

... A, B, C, D, E. Each experiment ... treatments, I, II, III, IV, V. Let the
arithmetic means ... be as follows:

[Table of treatment means for experiments A to E: not legible in the source.]


For experiment A all means fall exactly on a straight line. A linear
regression component will be obtained, but no higher-order or deviation
components. For experiment B the means increase and then decrease. A
quadratic regression component will be obtained, but no linear component,
because the means show no over-all tendency to either increase or decrease in
a linear fashion. For experiment C the means show an over-all tendency to
increase, but not in a linear fashion. A decreasing increment is observed
from one mean to the next. Here a linear regression component corresponding
to the over-all increase in means will result; a quadratic component will
also result, which will reflect, as it were, the bending nature of the
relation. Experiment D illustrates a cubic relation. A linear component will
also be obtained which reflects the over-all tendency of the means to
increase. A quadratic component will reflect the asymmetrical nature of the
cubic relation. Experiment E illustrates a quartic relation. For this
experiment the linear, quadratic, and cubic components will be zero.

19.8 COMPUTATION FORMULAS FOR TREND ANALYSIS USING ORTHOGONAL POLYNOMIALS

The computation formulas for the total and within-groups sum of squares are
the same as for one-way classification, and are given in Chap. 15. For equal
n's the various regression components are given by

[19.11]  $$n b_1^2 \sum_{j=1}^{k} c_{1j}^2 = \frac{\left(\sum_{j=1}^{k} c_{1j} T_j\right)^2}{n \sum_{j=1}^{k} c_{1j}^2}$$

[19.12]  $$n b_2^2 \sum_{j=1}^{k} c_{2j}^2 = \frac{\left(\sum_{j=1}^{k} c_{2j} T_j\right)^2}{n \sum_{j=1}^{k} c_{2j}^2}$$

and so on. Here $T_j$ is the sum of the observations in the jth group, as in
Table 19.3.

19.9 ILLUSTRATIVE EXAMPLE OF TREND ANALYSIS USING ORTHOGONAL POLYNOMIALS

Table 19.3 shows an illustrative example of the computation required for a
trend analysis using orthogonal polynomials. In this example n = 8, the
groups being of equal size, and k = 4. Table 19.4 shows the corresponding
analysis of variance table. In this example the linear component is
significant with p < .01. Neither the quadratic nor the cubic regression
components are significant. The reader will note that the deviation component
is zero. This occurs because a cubic equation will always fit four points


Table 19.3 Illustrative example of computation for trend analysis using
orthogonal polynomials

                          Treatment
                  I        II       III      IV
                  5        4        5        14
                  7        12       8        7
                  6        8        17       6
                  3        7        26       12
                  9        7        14       15
                  7        6        8        19
                  4        4        13       8
                  2        9        11       11
    n             8        8        8        8
    T_j           43       57       102      92
    Mean          5.38     7.13     12.75    11.50
    c_1j          -3       -1       1        3
    c_2j          1        -1       -1       1
    c_3j          -1       3        -3       1
    ΣX²           269      455      1,604    1,196

    T = 294     T²/N = 2,701.13     ΣT_j²/n = 2,995.75     ΣΣX² = 3,524
    Σc_1j T_j = 192     Σc_2j T_j = -24     Σc_3j T_j = -86
    Σc_1j² = 20         Σc_2j² = 4          Σc_3j² = 20

    Sum of squares
    Linear       192²/(8 × 20) = 230.40
    Quadratic    (-24)²/(8 × 4) = 18.00
    Cubic        (-86)²/(8 × 20) = 46.23
    Deviation    822.87 - 528.24 - 230.40 - 18.00 - 46.23 = .00
    Within       3,524.00 - 2,995.76 = 528.24
    Total        3,524.00 - 2,701.13 = 822.87


Table 19.4 Analysis of variance using trend analysis for data of Table 19.3

    Source of                Sum of                  Variance
    variation                squares       df        estimate
    Linear regression        230.40        1         s_1² = 230.40
    Quadratic regression     18.00         1         s_2² = 18.00
    Cubic regression         46.23         1         s_3² = 46.23
    Deviation                .00           0
    Within                   528.24        28        s_w² = 18.87
    Total                    822.87        31

    F_1 = 230.40/18.87 = 12.21    F_2 = 18.00/18.87 = .95    F_3 = 46.23/18.87 = 2.45


exactly, just as a linear equation will always fit two points exactly, and a
quadratic equation three points. In this example with k = 4 the analysis
has been carried as far as possible.
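
The arithmetic of this example can be retraced from the group totals alone.
The sketch below is a minimal Python illustration, assuming the totals
$T_j$ = 43, 57, 102, 92, n = 8, and the total of squared observations 3,524
printed in Table 19.3. The function name `trend_sums_of_squares` is a
hypothetical helper, not something defined in the text.

```python
def trend_sums_of_squares(totals, n, coefficient_sets):
    """Sum of squares for each orthogonal component with equal n:
    (sum_j c_j * T_j)^2 / (n * sum_j c_j^2)."""
    components = []
    for c in coefficient_sets:
        contrast = sum(cj * tj for cj, tj in zip(c, totals))
        components.append(contrast ** 2 / (n * sum(cj ** 2 for cj in c)))
    return components


totals = [43, 57, 102, 92]          # group totals T_j from Table 19.3
n = 8                               # observations per group
k = len(totals)
N = n * k
sum_x_squared = 3524                # sum of all squared observations

coefficients = [
    [-3, -1, 1, 3],                 # linear
    [1, -1, -1, 1],                 # quadratic
    [-1, 3, -3, 1],                 # cubic
]

linear, quadratic, cubic = trend_sums_of_squares(totals, n, coefficients)

between = sum(t ** 2 for t in totals) / n - sum(totals) ** 2 / N
within = sum_x_squared - sum(t ** 2 for t in totals) / n
deviation = between - (linear + quadratic + cubic)

s_w2 = within / (N - k)             # within-groups variance estimate
f_linear = linear / s_w2
```

Under these assumptions the three components exhaust the between-groups sum
of squares, so the deviation term is zero apart from floating-point error.
The results agree with Table 19.3 to within rounding.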


19.10 TREND ANALYSIS WITH UNEQUAL n's

If a single trend component only is under study, whether linear, quadratic,
cubic, or of a higher order, the n's may be equal or unequal. If the intent
of the investigator is to examine only the linear and quadratic components
and no higher-order components, the experiment should be designed such that
the n's are either equal or constitute a symmetrical set, such as, for
example, 4, 10, 4, for k = 3, or 10, 20, 20, 10, for k = 4. The linear and
quadratic components are orthogonal when $\sum n_j c_{1j} c_{2j} = 0$. For
orthogonal components with unequal n's the linear regression sum of squares
is given by $(\sum c_{1j} T_j)^2 / \sum n_j c_{1j}^2$, and the quadratic by
$(\sum c_{2j} T_j)^2 / \sum n_j c_{2j}^2$.

For higher-order trend analysis it is advisable to design the experiment
such that the n's are equal. The reason for this is that the various score
components, $b_1 c_{1j}$, $b_2 c_{2j}$, and the like, in the equation for an
orthogonal polynomial, are orthogonal for equal n's. For unequal n's the
components will not be orthogonal, except under rather special
circumstances. The first three components form an orthogonal set when
$\sum n_j c_{1j} c_{2j} = \sum n_j c_{1j} c_{3j} = \sum n_j c_{2j} c_{3j} = 0$.
Clearly for unequal n's this set of components will not ordinarily be
orthogonal.

In practical experimental work, situations arise where the n's are unequal.
If the departures from equality are not gross, methods may be used identical
to those described in Chap. 16 for making adjustments for unequal n's in the
analysis of variance for two-way classification.
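
The orthogonality condition for unequal n's is easy to examine numerically.
The following Python sketch computes the weighted sums of products
$\sum n_j c_{aj} c_{bj}$ for every pair of coefficient sets. The function
name and the unsymmetrical set 4, 10, 6 are illustrative assumptions; the
symmetrical sets are those mentioned in the text.

```python
from itertools import combinations


def weighted_cross_products(ns, coefficient_sets):
    """Return {(a, b): sum_j n_j * c_aj * c_bj} for every pair of
    coefficient sets, numbered from 1 (1 = linear, 2 = quadratic, ...)."""
    result = {}
    numbered = list(enumerate(coefficient_sets, start=1))
    for (a, ca), (b, cb) in combinations(numbered, 2):
        result[(a, b)] = sum(n * x * y for n, x, y in zip(ns, ca, cb))
    return result


K3 = [[-1, 0, 1], [1, -2, 1]]
K4 = [[-3, -1, 1, 3], [1, -1, -1, 1], [-1, 3, -3, 1]]

symmetric_k3 = weighted_cross_products([4, 10, 4], K3)
unsymmetric_k3 = weighted_cross_products([4, 10, 6], K3)   # hypothetical n's
symmetric_k4 = weighted_cross_products([10, 20, 20, 10], K4)
equal_k4 = weighted_cross_products([8, 8, 8, 8], K4)
```

With the symmetrical set 10, 20, 20, 10 the linear and quadratic components
remain orthogonal, but the linear and cubic components do not, which is in
keeping with the advice to use equal n's for higher-order trend analysis.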


19.11 TREND ANALYSIS: CORRELATED DATA

The methods of trend analysis described above have application to
experimental data obtained from k independent groups. These methods may be
extended to correlated data with measurements obtained on a group of n
subjects under k conditions. Such data are illustrated in Table 16.7, Sec.
16.13. The columns represent four conditions and the rows 10 subjects.
Variance estimates are obtained for rows, columns, and interaction. The
difference between column means is tested using $F = s_c^2/s_{rc}^2$, the
denominator being the interaction variance estimate.

If the differences between the experimental conditions are equally ...
independent groups the within-groups variance estimate, $s_w^2$, is used as
the error term to test the regression and deviation components, whereas for
correlated data the interaction variance estimate $s_{rc}^2$ is used in the
denominator of the F ratio.

Thus the F test for linear regression is $F = s_1^2/s_{rc}^2$ with
$df_1 = 1$ and $df_2 = (R - 1)(C - 1) = (n - 1)(k - 1)$. The test for
quadratic regression is $F = s_2^2/s_{rc}^2$, and so on. The test for the
deviation component is $F = s_d^2/s_{rc}^2$ with $df_1 = C - m - 1$, m being
the number of regression components removed, and $df_2 = (n - 1)(k - 1)$.


is chapter have application to 
The methods may be applied 
ments. With a two-way facto- 
ap. 16, both row and column 




X, 7.40 10.50 12.65 
п ы 
SXF 2456 2.150 2,760 E 
ізі $ 
BT. NM 222 3 
3 


Calculate (a) the within-group, linear regression, and deviation sums of
squares, (b) the three variance estimates, (c) F ratios for testing the
significance of linear regression and the deviations from linear regression.

2 Obtain the slope of the linear regression line for the data of Exercise 1
above, assuming a unit difference between the levels of the treatment
variable.

3 What are the coefficients for orthogonal polynomials $c_{1j}$, $c_{2j}$,
$c_{3j}$ for k = 6?

4 Consider the following data:

              I        II       III      IV       V
    n         10       10       10       10       10
    Mean      5.50     8.65     10.43    4.86     9.50
    ΣX²       305      875      1050     305      905

Calculate the linear regression, quadratic regression, cubic regression,
deviation, and within-groups variance estimates and apply the appropriate
tests of significance.

5 What difficulties will attach to the analysis of experimental data for
quadratic, cubic, and higher-order trend, when the n's in the k groups are
unequal?


ANALYSIS OF COVARIANCE

20.1 INTRODUCTION


One object of experimental design is to ensure that the results observed
may be attributed within limits of error to the treatment variable and to no
other causal circumstance. For example, the assignment of subjects to
groups at random and the matching of subjects are experimental procedures
the purpose of which is to ensure freedom from bias. Situations ... "adjust
for" the effects of one or more uncontrolled variables ... thereby, a valid
evaluation of the outcome ... The analysis of covariance is such a method.

To illustrate, an investigator may wish to compare three different
methods of learning French. Each method is applied to a different group of
subjects. Mean scores on a test of French achievement, following a period
of instruction, are obtained for the three ... differ in intelligence, and
... achievement. Thus the ... differences in French achievement result from
... an uncontrolled variable. If measures of intelligence are available, the
analysis of covariance may be used to compare the differences in French
achievement between classes, with the influence of intelligence, as it were,
statistically controlled.

Consider another example. The effect of two drugs on motor performance is
under study. Two groups of subjects are tested in an initial predrug
condition and under the influence of one of the drugs. The initial level of
motor performance for the two groups may be different. Initial level is an
uncontrolled variable. Part of the differences in motor performance under
the drug condition may be due to differences in initial level. The analysis of
covariance may be used to remove the bias introduced by differences in
initial level and permit the making of unbiased comparisons between drug
effects. The analysis of covariance is quite commonly used in drug studies
in just this way.

In psychology and education primary interest in the analysis of
covariance rests in its use as a procedure for the statistical control of an
uncontrolled variable. It may, however, serve other purposes, such as
testing the homogeneity of a set of regression coefficients and related
hypotheses.

In applications of the analysis of covariance the influence of the
uncontrolled variable, sometimes called the covariate or the concomitant
variable, is usually removed by a simple linear regression method, and the
residual sums of squares are used to provide variance estimates which in
turn are used to make tests of significance. The reader is advised at this
time to review his knowledge of simple linear regression.


20.2 NOTATION

An application of a simple analysis of covariance requires paired
observations on k groups of experimental subjects. The number of pairs of
observations in the k groups is denoted by $n_1, n_2, \ldots, n_k$. The
paired observations are assumed to be paired samples drawn from k
populations. The data may be represented as follows:

              Group 1              Group 2          ...        Group k
            Y_11    X_11         Y_12    X_12                Y_1k    X_1k
            Y_21    X_21         Y_22    X_22                Y_2k    X_2k
            ...     ...          ...     ...                 ...     ...
            Y_i1    X_i1         Y_i2    X_i2                Y_ik    X_ik
            ...     ...          ...     ...                 ...     ...
            Y_n1,1  X_n1,1       Y_n2,2  X_n2,2              Y_nk,k  X_nk,k
    Means   Ȳ_1     X̄_1          Ȳ_2     X̄_2                 Ȳ_k     X̄_k

In this notation X is the variable under study, the dependent variable,
whereas Y is the uncontrolled variable, or covariate. In the examples of the
previous section, either X is a measure of French achievement and Y is a
measure of intelligence, or X is a measure of motor performance under the
influence of a drug and Y is a corresponding measure obtained under the
initial predrug condition. The analysis of covariance enables a comparison
of the group means of X adjusted for differences in the means of Y. The
group means on Y and X are represented by
$\bar{Y}_1, \bar{Y}_2, \ldots, \bar{Y}_k$ and
$\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_k$. A dot notation
$\bar{Y}_{.1}, \bar{Y}_{.2}$, and so on, where the dot represents the
variable subscript, would perhaps be more appropriate. For convenience,
however, we have chosen to use a notation without the dot. The grand means
of Y and X are $\bar{Y}$ and $\bar{X}$, respectively.

In the analysis of covariance, sums of products are considered. The sum
of products for the observations in the jth group is denoted by

$$\sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)(Y_{ij} - \bar{Y}_j)$$

The sum of products for all observations in the k groups, that is, the total
sum of products, is

$$\sum_{j=1}^{k}\sum_{i=1}^{n_j} (X_{ij} - \bar{X})(Y_{ij} - \bar{Y})$$


20.3 PARTITIONING A SUM OF PRODUCTS

Before proceeding further with the discussion, it is useful to observe that
with paired observations on k groups the total sum of products may be
partitioned into within-groups and between-groups sums of products in a
manner analogous to that in which a total sum of squares is partitioned into
within-groups and between-groups sums of squares. As in the analysis of
variance for one-way classification we may write

$$X_{ij} - \bar{X} = (X_{ij} - \bar{X}_j) + (\bar{X}_j - \bar{X})$$

$$Y_{ij} - \bar{Y} = (Y_{ij} - \bar{Y}_j) + (\bar{Y}_j - \bar{Y})$$

These are multiplied, summed over the $n_j$ cases in the jth group, and
summed over k groups. When this is done, two cross-product terms to the
right vanish in the summation process, and we obtain

[20.1]  $$\sum_{j=1}^{k}\sum_{i=1}^{n_j}(X_{ij} - \bar{X})(Y_{ij} - \bar{Y}) = \sum_{j=1}^{k}\sum_{i=1}^{n_j}(X_{ij} - \bar{X}_j)(Y_{ij} - \bar{Y}_j) + \sum_{j=1}^{k} n_j(\bar{X}_j - \bar{X})(\bar{Y}_j - \bar{Y})$$

The term to the left is a total sum of products of deviations about the grand
means $\bar{X}$ and $\bar{Y}$. The first term to the right is the sum of
products of deviations about the group means. It is the within-groups sum of
products. The second term to the right is the sum of products of deviations
between groups. It is the between-groups sum of products.
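
The partition can be confirmed numerically. The sketch below, in Python,
computes the total, within-groups, and between-groups sums of products
directly from deviations and checks that the first equals the sum of the
other two. It assumes the paired observations of Exercise 1 at the end of
this chapter as sample data; the function names are hypothetical. Exact
fractions are used so that the identity holds without rounding error.

```python
from fractions import Fraction


def mean(values):
    """Exact arithmetic mean."""
    return Fraction(sum(values), len(values))


def partition_sum_of_products(groups):
    """groups: list of groups, each a list of (y, x) pairs.
    Returns (total, within, between) sums of products."""
    pairs = [p for g in groups for p in g]
    y_bar = mean([y for y, _ in pairs])
    x_bar = mean([x for _, x in pairs])
    total = sum((x - x_bar) * (y - y_bar) for y, x in pairs)

    within = Fraction(0)
    between = Fraction(0)
    for g in groups:
        y_j = mean([y for y, _ in g])
        x_j = mean([x for _, x in g])
        within += sum((x - x_j) * (y - y_j) for y, x in g)
        between += len(g) * (x_j - x_bar) * (y_j - y_bar)
    return total, within, between


# (Y, X) pairs for the three groups of Exercise 1
groups = [
    [(2, 7), (5, 6), (7, 9), (9, 15), (10, 12)],
    [(8, 15), (12, 24), (15, 25), (18, 19), (19, 31)],
    [(15, 30), (16, 35), (20, 32), (24, 38), (30, 40)],
]

total, within, between = partition_sum_of_products(groups)
```

The same three values are obtained from raw sums and totals, without
computing any deviations.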




20.4 REGRESSION LINES

With data consisting of paired observations for k groups, a number of
different regression lines may be identified. The slope of the regression
line used in predicting X from a knowledge of Y, as given in Chap. 8, is

[20.2]  $$b_{xy} = \frac{\sum_{i=1}^{N}(X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{N}(Y_i - \bar{Y})^2}$$

This slope is obtained by dividing a sum of products by a sum of squares. If
we divide the total, within-groups, and between-groups sums of products by
the corresponding sums of squares, three regression coefficients are
obtained. These are the slopes of three different regression lines.

The first is the total over-all regression line for predicting X from a
knowledge of Y based on all the observations put together. The slope of this
line is

[20.3]  $$b_t = \frac{\sum_{j=1}^{k}\sum_{i=1}^{n_j}(X_{ij} - \bar{X})(Y_{ij} - \bar{Y})}{\sum_{j=1}^{k}\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y})^2}$$

A second regression line is the over-all within-groups regression line. In
predicting X from a knowledge of Y we may consider each of the k groups
separately. Each group has its own within-group regression line with slope
$b_j$. Information from these k separate regression lines may be pooled to
obtain an over-all within-groups regression line whose slope is given by

[20.4]  $$b_w = \frac{\sum_{j=1}^{k}\sum_{i=1}^{n_j}(X_{ij} - \bar{X}_j)(Y_{ij} - \bar{Y}_j)}{\sum_{j=1}^{k}\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y}_j)^2}$$

The numerator of this equation is the within-groups sum of products, and
the denominator is the within-groups sum of squares for Y. It should be
noted here that the slopes of the individual group regression lines,
$b_1, b_2, \ldots, b_k$, are estimates of population parameters
$\beta_1, \beta_2, \ldots, \beta_k$. The pooling process used to obtain
$b_w$ involves an assumption of homogeneity of slope, that is, that
$\beta_1 = \beta_2 = \cdots = \beta_k$. This is a basic assumption in the
analysis of covariance.

A third regression line may be considered with slope $b_b$ obtained by
dividing the between-groups sum of products by the between-groups sum of
squares for Y. The slope of this line is

[20.5]  $$b_b = \frac{\sum_{j=1}^{k} n_j(\bar{X}_j - \bar{X})(\bar{Y}_j - \bar{Y})}{\sum_{j=1}^{k} n_j(\bar{Y}_j - \bar{Y})^2}$$

Of particular interest in the analysis of covariance is the within-groups
regression equation. This equation has the form

[20.6]  $$X_{ij}' = b_w (Y_{ij} - \bar{Y}_j) + \bar{X}_j$$

where $b_w$ is the within-groups regression slope. As will be shown, the
analysis of covariance makes use of the sum of squares of residuals of X
about this regression line.
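
Each of the three slopes is simply a sum of products divided by the
corresponding sum of squares for Y. The Python sketch below illustrates this
using the sums of products and sums of squares for Y reported for the
illustrative example of Table 20.1 later in this chapter. The dictionary
layout and the function name `slope` are assumptions made for the
illustration.

```python
def slope(sum_of_products, sum_of_squares_y):
    """Regression slope for predicting X from Y."""
    return sum_of_products / sum_of_squares_y


# Values taken from the illustrative example (Table 20.1)
sp = {"between": 243.53, "within": 288.47, "total": 532.00}
ss_y = {"between": 533.89, "within": 714.73, "total": 1248.62}

b_t = slope(sp["total"], ss_y["total"])      # over-all slope
b_w = slope(sp["within"], ss_y["within"])    # pooled within-groups slope
b_b = slope(sp["between"], ss_y["between"])  # between-groups slope

# Because the sums of products and the sums of squares are both additive,
# b_t is a weighted average of b_w and b_b, with weights equal to the
# within-groups and between-groups sums of squares for Y.
weighted = (ss_y["within"] * b_w + ss_y["between"] * b_b) / (
    ss_y["within"] + ss_y["between"]
)
```

The within-groups slope obtained in this way is the value .404 used later in
the chapter to adjust the group means.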


20.5 ADJUSTING THE SUM OF SQUARES OF X


Given measurements on Y and X for k groups the total sum of squares for
both Y and X may be partitioned into within-groups and between-groups
sums of squares, using the analysis of variance methods described in
Chap. 15. Also the total sum of products may be partitioned into
within-groups and between-groups sums of products, as described in Sec.
20.3. How may the sum of squares on X be adjusted to allow for, or to
remove, the influence of the variation of the uncontrolled variable Y?

Let us first consider, in general, the problem of calculating a sum of
squares of residuals of X about the linear regression line used in predicting
X from a knowledge of Y. The equation for this linear regression line may
be written $X_i' = b_{xy}(Y_i - \bar{Y}) + \bar{X}$, where $X_i'$ is a
predicted value, and $b_{xy}$ is the slope of the line. The sum of squares of
residuals about this line is $\sum_{i=1}^{N}(X_i - X_i')^2$. By substituting
$b_{xy}(Y_i - \bar{Y}) + \bar{X}$ for $X_i'$ in this sum of squares and
using simple algebra, it is readily shown that the sum of squares of
residuals is

[20.7]  $$\sum_{i=1}^{N}(X_i - X_i')^2 = \sum_{i=1}^{N}\left[(X_i - \bar{X}) - b_{xy}(Y_i - \bar{Y})\right]^2 = \sum_{i=1}^{N}(X_i - \bar{X})^2 - \frac{\left[\sum_{i=1}^{N}(X_i - \bar{X})(Y_i - \bar{Y})\right]^2}{\sum_{i=1}^{N}(Y_i - \bar{Y})^2}$$

We proceed by calculating an adjusted total sum of squares on X, using a
regression line with slope $b_t$ based on all the observations for the k
groups put together. This adjusted total sum of squares is given by

[20.8]  $$\sum_{j=1}^{k}\sum_{i=1}^{n_j}(X_{ij} - X_{ij}')^2 = \sum_{j=1}^{k}\sum_{i=1}^{n_j}\left[(X_{ij} - \bar{X}) - b_t(Y_{ij} - \bar{Y})\right]^2 = \sum_{j=1}^{k}\sum_{i=1}^{n_j}(X_{ij} - \bar{X})^2 - \frac{\left[\sum_{j=1}^{k}\sum_{i=1}^{n_j}(X_{ij} - \bar{X})(Y_{ij} - \bar{Y})\right]^2}{\sum_{j=1}^{k}\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y})^2}$$

The next step is to calculate an adjusted within-groups sum of squares
using the within-group regression line with slope $b_w$. This sum of squares
is given by

[20.9]  $$\sum_{j=1}^{k}\sum_{i=1}^{n_j}(X_{ij} - X_{ij}')^2 = \sum_{j=1}^{k}\sum_{i=1}^{n_j}\left[(X_{ij} - \bar{X}_j) - b_w(Y_{ij} - \bar{Y}_j)\right]^2 = \sum_{j=1}^{k}\sum_{i=1}^{n_j}(X_{ij} - \bar{X}_j)^2 - \frac{\left[\sum_{j=1}^{k}\sum_{i=1}^{n_j}(X_{ij} - \bar{X}_j)(Y_{ij} - \bar{Y}_j)\right]^2}{\sum_{j=1}^{k}\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y}_j)^2}$$

Thus to calculate an adjusted within-groups sum of squares on X, we
subtract from the within-groups sum of squares on X a quantity which is equal
to the within-groups sum of products, squared, divided by the within-groups
sum of squares on Y. The adjusted sum of squares between groups is now
obtained by subtracting the adjusted within-groups sum of squares on X from
the adjusted total sum of squares.

20.6 DEGREES OF FREEDOM AND VARIANCE ESTIMATES

The numbers of degrees of freedom associated with the unadjusted and
adjusted sums of squares on X are as follows:

                 Unadjusted X     Adjusted X
    Between      k - 1            k - 1
    Within       N - k            N - k - 1
    Total        N - 1            N - 2

The number of degrees of freedom associated with the adjusted total sum
of squares on X is $N - 2$. This sum of squares consists of squared
residuals about a linear regression line, and as such has $N - 2$, and not
$N - 1$, degrees of freedom. Thus 1 degree of freedom is lost in the
adjustment process. The number of degrees of freedom associated with the
adjusted
within-groups sum of squares is $N - k - 1$. The number of degrees of
freedom associated with the adjusted between-groups sum of squares is
$k - 1$, and is unchanged because the between-groups regression line did
not enter into the calculation of the between-groups sum of squares, this
sum of squares being obtained by subtraction.

The adjusted sums of squares on X are now divided by their associated
degrees of freedom to obtain within-groups and between-groups variance
estimates $s_w^2$ and $s_b^2$. The interpretation of these variance
estimates is the same as in the analysis of variance, except that the null
hypothesis under test relates to adjusted treatment means, that is, means
that are free of the linear effect of the covariate.

To test the significance of the difference between the adjusted means
of X, an F ratio $s_b^2/s_w^2$ is obtained. This ratio is interpreted with
$df = k - 1$ associated with the numerator and $df = N - k - 1$ associated
with the denominator.

20.7 COMPUTATION FORMULAS

Computation formulas for sums of squares for X and Y are given in Sec.
15.6. To simplify the notation for the computation formulas for sums of
products, denote the sums of all the observations in the jth group for X and
Y by $T_{xj}$ and $T_{yj}$, respectively. Thus

$$\sum_{i=1}^{n_j} X_{ij} = T_{xj} \qquad \sum_{i=1}^{n_j} Y_{ij} = T_{yj}$$

Denote the sums of X and Y for all the observations together in the k groups
by $T_x$ and $T_y$. Thus

$$\sum_{j=1}^{k}\sum_{i=1}^{n_j} X_{ij} = T_x \qquad \sum_{j=1}^{k}\sum_{i=1}^{n_j} Y_{ij} = T_y$$

The sum of products for the jth group may be represented by

$$\sum_{i=1}^{n_j} X_{ij}Y_{ij} = T_{xyj}$$

and the sum of products for all observations in the k groups by

$$\sum_{j=1}^{k}\sum_{i=1}^{n_j} X_{ij}Y_{ij} = T_{xy}$$

The computation formula for the total sum of products is

[20.10]  $$\sum_{j=1}^{k}\sum_{i=1}^{n_j}(X_{ij} - \bar{X})(Y_{ij} - \bar{Y}) = T_{xy} - \frac{T_x T_y}{N}$$

The within-groups sum of products may be obtained by

[20.11]  $$\sum_{j=1}^{k}\sum_{i=1}^{n_j}(X_{ij} - \bar{X}_j)(Y_{ij} - \bar{Y}_j) = T_{xy} - \sum_{j=1}^{k}\frac{T_{xj}T_{yj}}{n_j}$$

The between-groups sum of products is

[20.12]  $$\sum_{j=1}^{k} n_j(\bar{X}_j - \bar{X})(\bar{Y}_j - \bar{Y}) = \sum_{j=1}^{k}\frac{T_{xj}T_{yj}}{n_j} - \frac{T_x T_y}{N}$$

The above formulas are applicable to groups of unequal or equal size. In
the particular case where $n_1 = n_2 = \cdots = n_k = n$ we may, of course,
write the term

[20.13]  $$\sum_{j=1}^{k}\frac{T_{xj}T_{yj}}{n_j} = \frac{1}{n}\sum_{j=1}^{k} T_{xj}T_{yj}$$
20.8 SUMMARY

In summary, to test the significance of the difference between k adjusted
means on X using the analysis of covariance, the following steps are
involved:

1 Partition the total sum of squares on both Y and X into two components, a
within-groups and a between-groups sum of squares, using the usual analysis
of variance formulas.

2 Partition the total sum of products into two components, a within-groups
and a between-groups sum of products.

3 Calculate an adjusted total sum of squares on X to remove the linear
effects of the covariate Y.

4 Calculate an adjusted within-groups sum of squares on X using the
within-groups regression of X on Y.

5 Calculate an adjusted between-groups sum of squares by subtraction; that
is, subtract the adjusted within-groups sum of squares from the adjusted
total sum of squares.

6 Obtain the variance estimates $s_w^2$ and $s_b^2$ by dividing the adjusted
within-groups sum of squares on X by $df = N - k - 1$, and the
between-groups by $df = k - 1$.

7 Test the significance of the adjusted means on X by referring
$F = s_b^2/s_w^2$ to a table of F.
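
Steps 3 to 7 above can be sketched in a few lines of Python once the
partitioned sums of squares and products from steps 1 and 2 are in hand. The
function `ancova_f` and its argument layout are hypothetical; the numerical
inputs are the totals and within-groups values reported for the illustrative
example in Table 20.1.

```python
def ancova_f(ss_x, ss_y, sp, n_total, k):
    """Single-covariate analysis of covariance from summary quantities.

    ss_x, ss_y, sp: dicts with 'total' and 'within' entries holding the
    sums of squares on X, on Y, and the sums of products.
    Returns the adjusted sums of squares, variance estimates, and F."""
    # Step 3: adjusted total sum of squares on X
    adj_total = ss_x["total"] - sp["total"] ** 2 / ss_y["total"]
    # Step 4: adjusted within-groups sum of squares on X
    adj_within = ss_x["within"] - sp["within"] ** 2 / ss_y["within"]
    # Step 5: adjusted between-groups sum of squares, by subtraction
    adj_between = adj_total - adj_within
    # Step 6: variance estimates
    s_b2 = adj_between / (k - 1)
    s_w2 = adj_within / (n_total - k - 1)
    # Step 7: F ratio with df = (k - 1, N - k - 1)
    return {
        "adj_total": adj_total,
        "adj_within": adj_within,
        "adj_between": adj_between,
        "s_b2": s_b2,
        "s_w2": s_w2,
        "F": s_b2 / s_w2,
        "df": (k - 1, n_total - k - 1),
    }


result = ancova_f(
    ss_x={"total": 527.33, "within": 408.43},
    ss_y={"total": 1248.62, "within": 714.73},
    sp={"total": 532.00, "within": 288.47},
    n_total=24,
    k=3,
)
```

The F ratio would then be referred to a table of F with the degrees of
freedom returned in `result["df"]`.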


20.9 ILLUSTRATIVE EXAMPLE: ANALYSIS OF COVARIANCE

Table 20.1 shows an artificial example illustrating the analysis of
covariance. Information is available on two variables for k = 3 with
unequal n's. The treatment means are $\bar{X}_1 = 11.70$,
$\bar{X}_2 = 8.75$, and $\bar{X}_3 = 6.17$. Although these means differ
appreciably, the means for the covariate Y also differ, these being
$\bar{Y}_1 = 17.60$, $\bar{Y}_2 = 9.25$, and $\bar{Y}_3 = 6.83$. If a
correlation

Table 20.1 Computation for the analysis of covariance: paired observations
on Y and X for three groups

[Individual paired observations: not legible in the source.]

                              Group I              Group II             Group III
                              Y         X          Y         X          Y         X
    n_j                       10                   8                    6
    T_yj, T_xj                176       117        74        70         41        37
    Means                     17.60     11.70      9.25      8.75       6.83      6.17
    ΣY², ΣX²                  3,720     1,651      750       704        307       263
    T_xyj                     2,341                652                  255
    T_yj²/n_j, T_xj²/n_j      3,097.60  1,368.90   684.50    612.50     280.17    228.17

    N = 24
    T_y = 291     Mean of Y = 12.13     T_y²/N = 3,528.38
    ΣΣY² = 4,777          ΣT_yj²/n_j = 4,062.27
    T_x = 224     T_x²/N = 2,090.67
    ΣΣX² = 2,618          ΣT_xj²/n_j = 2,209.57
    T_xy = 3,248  ΣT_xjT_yj/n_j = 2,959.53     T_xT_y/N = 2,716.00

    Sum of squares
                 Y                                  X
    Between      4,062.27 - 3,528.38 = 533.89       2,209.57 - 2,090.67 = 118.90
    Within       4,777.00 - 4,062.27 = 714.73       2,618.00 - 2,209.57 = 408.43
    Total        4,777.00 - 3,528.38 = 1,248.62     2,618.00 - 2,090.67 = 527.33

    Sum of products
    Between      2,959.53 - 2,716.00 = 243.53
    Within       3,248.00 - 2,959.53 = 288.47
    Total        3,248.00 - 2,716.00 = 532.00




of some magnitude exists between Y and X, we might anticipate that a
substantial part of the variation in the X means will result from the
differences in the Y means. All terms necessary for the direct calculation
of sums of squares on X and Y and sums of products are given in Table 20.1.

Table 20.2 summarizes the analysis of covariance. The adjusted total
sum of squares for X is $527.33 - 532.00^2/1{,}248.62 = 300.66$. The adjusted
within-group sum of squares is $408.43 - 288.47^2/714.73 = 292.00$. The
adjusted between-groups sum of squares is $300.66 - 292.00 = 8.66$. The
variance estimates are $s_b^2 = 4.33$ and $s_w^2 = 14.60$, and
$F = 4.33/14.60 = .30$. This ratio is not significant, and, indeed, it falls
substantially short of the value of F of unity expected under the null
hypothesis. Quite clearly almost all the variation in the X means can be
attributed to the influence of the uncontrolled variable Y.

It is of interest here to calculate directly the adjusted means on X. These
adjusted values are given by

$$\bar{X}_j' = b_w(\bar{Y} - \bar{Y}_j) + \bar{X}_j$$

In this sample $b_w = 288.47/714.73 = .404$. The three adjusted means are

$$\bar{X}_1' = .404(12.13 - 17.60) + 11.70 = 9.49$$
$$\bar{X}_2' = .404(12.13 - 9.25) + 8.75 = 9.91$$
$$\bar{X}_3' = .404(12.13 - 6.83) + 6.17 = 8.31$$

These adjusted means vary very little one from another, a fact which is
clearly reflected in the small F ratio. We may safely conclude that the
differences between the unadjusted means for X are due largely to the effects
of Y.
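
The adjustment of the group means is a one-line calculation per group. The
Python sketch below reproduces the three adjusted means, taking the rounded
slope .404, the grand mean of Y (12.13), and the group means from the text.
The function name `adjusted_mean` is illustrative only.

```python
def adjusted_mean(b_w, y_grand_mean, y_group_mean, x_group_mean):
    """Group mean on X adjusted for the group's departure from the
    grand mean on the covariate Y."""
    return b_w * (y_grand_mean - y_group_mean) + x_group_mean


b_w = 0.404           # within-groups slope, rounded as in the text
y_grand_mean = 12.13  # grand mean of the covariate

group_means = [       # (mean of Y, mean of X) for groups I, II, III
    (17.60, 11.70),
    (9.25, 8.75),
    (6.83, 6.17),
]

adjusted = [
    round(adjusted_mean(b_w, y_grand_mean, y_j, x_j), 2)
    for y_j, x_j in group_means
]
```

The first group, whose covariate mean lies well above the grand mean, is
adjusted downward, and the other two groups are adjusted upward.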


Table 20.2 Analysis of covariance for data of Table 20.1

                                           Source of variation
                                     Between        Within         Total
    Sum of squares: Y                533.89         714.73         1,248.62
    Sum of squares: X                118.90         408.43         527.33
    Sum of products                  243.53         288.47         532.00
    Degrees of freedom               2              21             23
    Adjusted sum of squares: X       8.66           292.00         300.66
    Degrees of freedom for
      adjusted sum of squares        2              20             22
    Variance estimates               s_b² = 4.33    s_w² = 14.60

    F = 4.33/14.60 = .30     p > .05


20.10 HOMOGENEITY OF REGRESSION COEFFICIENTS

For certain purposes it may be a matter of interest to test the hypothesis
that the slopes of the regression lines within the k groups are the same;
that is, $H_0\colon \beta_1 = \beta_2 = \cdots = \beta_k = \beta_w$. This
hypothesis is assumed to be true in any application of the analysis of
covariance.

To test this hypothesis, we require that the sum of squares for Y and X
and the sum of products for each of the k treatments be given separately.
These values are given by

$$\sum_{i=1}^{n_j}(X_{ij} - \bar{X}_j)^2 = \sum_{i=1}^{n_j} X_{ij}^2 - \frac{T_{xj}^2}{n_j}$$

$$\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y}_j)^2 = \sum_{i=1}^{n_j} Y_{ij}^2 - \frac{T_{yj}^2}{n_j}$$

$$\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y}_j)(X_{ij} - \bar{X}_j) = \sum_{i=1}^{n_j} X_{ij}Y_{ij} - \frac{T_{xj}T_{yj}}{n_j}$$

The next step is to calculate separately for each group an adjusted sum
of squares on X in the manner previously described. This is done by
subtracting from the sum of squares for X a quantity equal to the square of
the sum of products divided by the sum of squares for Y. These adjusted sums
of squares are summed over the k groups to obtain

$$\sum_{j=1}^{k}\left\{\sum_{i=1}^{n_j}(X_{ij} - \bar{X}_j)^2 - \frac{\left[\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y}_j)(X_{ij} - \bar{X}_j)\right]^2}{\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y}_j)^2}\right\}$$

For convenience we denote this term by A. This term is a sum of squares of
residuals about the k individual regression lines. It has associated with it
$\sum_{j=1}^{k}(n_j - 2) = \sum_{j=1}^{k} n_j - 2k = N - 2k$ degrees of
freedom. The number of ...

To test for homogeneity of regression the following F ratio is calculated:

$$F = \frac{(B - A)/(k - 1)}{A/(N - 2k)}$$

with $k - 1$ and $N - 2k$ degrees of freedom associated with the numerator
and denominator, respectively. Here B denotes the adjusted within-groups sum
of squares on X.

The general rationale underlying the above F ratio is that when
$H_0\colon \beta_1 = \beta_2 = \cdots = \beta_k$ is true, the sum of squares
of residuals about the individual group regression lines with slope $b_j$
will be the same, within the limits of sampling error, as the sum of squares
of residuals about a single


Table 20.3 Computation for test of homogeneity of regression coefficients
using data of Table 20.1

                    Sum of squares                          Adjusted
                                           Sum of           sum of
    Group        Y            X            products         squares for X
    1            622.40       282.10       281.80           154.51
    2            65.50        91.50        4.50             91.19
    3            26.83        34.83        2.17             34.65
    Total        714.73       408.43       288.47           280.35 = A

    A = 280.35      B = 408.43 - 288.47²/714.73 = 292.00

    F = [(292.00 - 280.35)/(3 - 1)] / [280.35/(24 - 6)] = 5.83/15.58 = .37


within-groups regression line with slope $b_w$. Under this circumstance
$B - A$ will depart from zero only because of the presence of sampling
error, and the expected value of the F ratio is unity. When
$H_0\colon \beta_1 = \beta_2 = \cdots = \beta_k$ is not true, B will tend to
be greater than A and the expected value of F will be greater than unity.

Table 20.3 uses the data of Table 20.1 to illustrate a test of homogeneity
of regression. Sums of squares on Y and X, sums of products, and adjusted
sums of squares for X have been calculated for each of the three groups
separately. The adjusted sums of squares are summed to obtain A. The Y,
X, and sum-of-products columns are summed to obtain within-groups sums
of squares and products. An adjusted within-groups sum of squares on X is
calculated, and denoted by B. An F ratio is calculated and is found to be
.37, which is, of course, not significant.

In this example the regression slopes for the three groups are
respectively $b_1 = .45$, $b_2 = .07$, and $b_3 = .08$. Also, $b_w = .404$.
Given the small samples used in this illustrative example, the differences
between such coefficients would not prove to be significant. In fact, the
differences are less than chance expectation.
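
The test of homogeneity can be retraced from the per-group sums of squares
and products. The Python sketch below is a minimal illustration using the
three groups of Table 20.3. It takes the sum of squares on Y for group 2 as
65.50, the value consistent with the within-groups total of 714.73. The
function name `homogeneity_f` is hypothetical.

```python
def homogeneity_f(groups, n_total):
    """groups: list of (ss_y, ss_x, sp) tuples, one per group.
    Returns (A, B, F, slopes) for the test of equal within-group slopes."""
    k = len(groups)
    # A: residual sum of squares about the k separate regression lines
    a = sum(ss_x - sp ** 2 / ss_y for ss_y, ss_x, sp in groups)
    # B: residual sum of squares about the pooled within-groups line
    ss_y_w = sum(g[0] for g in groups)
    ss_x_w = sum(g[1] for g in groups)
    sp_w = sum(g[2] for g in groups)
    b = ss_x_w - sp_w ** 2 / ss_y_w
    f = ((b - a) / (k - 1)) / (a / (n_total - 2 * k))
    slopes = [sp / ss_y for ss_y, _, sp in groups]
    return a, b, f, slopes


table_20_3 = [
    (622.40, 282.10, 281.80),   # group 1: SS on Y, SS on X, sum of products
    (65.50, 91.50, 4.50),       # group 2
    (26.83, 34.83, 2.17),       # group 3
]

A, B, F, slopes = homogeneity_f(table_20_3, n_total=24)
rounded_slopes = [round(s, 2) for s in slopes]
```

The F ratio is referred to a table of F with k - 1 = 2 and N - 2k = 18
degrees of freedom.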


20.11 THE EXTENDED USE OF THE ANALYSIS OF COVARIANCE

The analysis of covariance, as described in this chapter, assumes that the
regression of X on Y is linear. The method may be extended to deal with
situations where the regression is nonlinear. Also, we have considered
situations involving only one uncontrolled variable or covariate. The
analysis of covariance may be used with more than one uncontrolled variable.
This involves the use of a multiple regression method of the type described
in Chap. 26. Our description of the analysis of covariance applies to
single-factor experiments. The method may be adapted for use with two-way,
and higher-order, factorial experiments. For a more comprehensive discussion
of the analysis of covariance the reader is referred to Winer (1962).

EXERCISES


1 The following are paired observations for three experimental groups:

                  I                 II                III
              Y       X         Y       X         Y       X
              2       7         8       15        15      30
              5       6         12      24        16      35
              7       9         15      25        20      32
              9       15        18      19        24      38
              10      12        19      31        30      40
    Means     6.60    9.80      14.40   22.80     21.00   35.00

In this example Y is the covariate or concomitant variable. Calculate the
adjusted total, within-groups, and between-groups sums of squares on X, and
test the significance of the differences between the adjusted means on X
using the appropriate F ratio.

2 For the data of Exercise 1 above calculate: (a) the slope of the over-all
regression line $b_t$ for predicting X from Y; (b) the slope of the over-all
within-groups regression line $b_w$; (c) the slope of the between-groups
regression line $b_b$.

3 Calculate for Exercise 1 above the adjusted means on X.

4 Test the homogeneity of the slopes of the three within-groups regression
lines in Exercise 1 above.


NONPARAMETRIC STATISTICS 


21.1 


= 
ж 
г 
> 


THE STATISTICS ОҒ RANKS 


ық Es 


INTRODUCTION 


Ordinal, or rank-order, data may arise in a number of different ways.
Quantitative measurements may be available, but ranks may be substituted to
reduce arithmetical labor or to make some desired form of calculation
possible. For example, measurements of height and weight may be obtained
for a group of school children. A correlation between the paired
measurements could readily be calculated. The investigator may, however,
choose to substitute ranks for the measurements and to calculate a
correlation between the paired ranks. In many situations where ranking
methods are used, quantitative measurements are not available. The
measuring operations used may be such that no comparative statements about
the intervals between members are possible. For example, employees may be
rank-ordered by supervisors on job performance. School children may be
ranked by teachers on social adjustment. Different teas may be
rank-ordered by experienced judges on taste, or participants in a beauty
contest may be rank-ordered by judges on pulchritude. In such cases the
data comprise sets of ordinal numbers, 1st, 2d, 3d, . . . , Nth.
These are replaced by the cardinal numbers 1, 2, 3, . . . , N for purposes
of calculation. The substitution of cardinal numbers for ordinal numbers
always assumes equality of intervals. The difference between the first and
second member is assumed equal to the difference between the second and
third, and so on. This assumption underlies all coefficients of rank
correlation. Because of difficulties associated with the measurement of
psychological variables, statistical methods for handling rank-order data
are of particular interest to psychologists.
Although rank correlation methods have been in use for many years,
more recently extensive use has been made of ranks in dealing with many
other statistical problems. Ranks are used, for example, in tests for
comparing two correlated or independent samples, these being the analogues
of t tests, and for a variety of other purposes. Such tests are described as
nonparametric or distribution free, and are discussed in some detail in
Chapter 22. Rank correlation methods are usually classed as nonparametric
procedures.


21.2  INTEGERS

Ranks are represented by the integers 1, 2, 3, . . . , N. These are
denoted by $X_1, X_2, X_3, \ldots, X_N$. The sum, and sum of squares, of the
first N integers may be shown to be as follows:

[21.1]    $\sum X = \frac{N(N + 1)}{2}$

[21.2]    $\sum X^2 = \frac{N(N + 1)(2N + 1)}{6}$

The mean of the first N integers is $\bar{X} = (N + 1)/2$ and the variance,
obtained by dividing the sum of squares about the mean by N, is
$s^2 = (N^2 - 1)/12$. The mean is a simple function of the variance,
$\bar{X} = 6s^2/(N - 1)$. A knowledge of the above relations is sometimes
useful.
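The relations for the first N integers can be checked by direct summation. The following is a minimal sketch in Python, not part of the original text; the function name `rank_moments` is introduced here purely for illustration.

```python
def rank_moments(n):
    """Sum, sum of squares, mean, and variance (divisor N) of the ranks 1..n."""
    ranks = list(range(1, n + 1))
    total = sum(ranks)
    total_sq = sum(x * x for x in ranks)
    mean = total / n
    variance = sum((x - mean) ** 2 for x in ranks) / n
    return total, total_sq, mean, variance


n = 10
total, total_sq, mean, variance = rank_moments(n)
print(total, n * (n + 1) // 2)                    # direct sum vs. formula [21.1]
print(total_sq, n * (n + 1) * (2 * n + 1) // 6)   # direct sum vs. formula [21.2]
print(mean, (n + 1) / 2)                          # mean of the ranks
print(variance, (n * n - 1) / 12)                 # variance of the ranks
print(mean, 6 * variance / (n - 1))               # mean as a function of the variance
```

Each printed pair should agree, whatever value of N is chosen.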


21.3  MEASURES OF DISARRAY

Consider N individuals $A_1, A_2, A_3, \ldots, A_N$ ranked on two variables X
and Y. The rankings on X may be denoted as $X_1, X_2, X_3, \ldots, X_N$ and
those on Y as $Y_1, Y_2, Y_3, \ldots, Y_N$. Let the ranks for five individuals
on X and Y be as follows:

    X    1    2    3    4    5
    Y    1    4    3    5    2

The X ranks are in their natural order. The Y ranks are not in their natural
order. They exhibit a degree of disarray with respect to X. How may a
measure of disarray in this situation be defined?

One commonly used measure of disarray is the sum of squares of differences
between the paired ranks. This quantity may be designated by $\sum d^2$. In
the above example $\sum (X - Y)^2 = \sum d^2 = (0)^2 + (-2)^2 + (0)^2 +
(-1)^2 + (3)^2 = 14$. It is of interest to consider the minimum and maximum
values of $\sum d^2$. When members are ranked in the same order on both X and
Y, $\sum d^2 = 0$ and is a minimum. Thus if the ranks on X are 1, 2, 3, 4, 5
and on Y 1, 2, 3, 4, 5, the differences are all zero. If the paired ranks are
in an inverse order, the maximum degree of disarray, $\sum d^2$ is a maximum.
Thus if the ranks on X are 1, 2, 3, 4, 5 and on Y 5, 4, 3, 2, 1, the
differences d are -4, -2, 0, 2, and 4 and $\sum d^2 = 40$. No arrangement of Y
with respect to X will produce a larger value of $\sum d^2$. It may be shown
that the maximum value of $\sum d^2$ is given by

[21.3]    $\sum d^2_{max} = \frac{N(N^2 - 1)}{3}$

If ranks on Y are arranged at random with respect to X, the expected value
of $\sum d^2$ is simply one-half $\sum d^2_{max}$, or

[21.4]    $E\left(\sum d^2\right) = \frac{N(N^2 - 1)}{6}$

The statistic $\sum d^2$ is only one of a number of measures of disarray which
might be defined.
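The values quoted above for the sum of squared rank differences can be reproduced by brute force. This is a minimal sketch in Python, not part of the original text; the helper names are assumptions introduced for illustration, and the expected value is obtained by averaging over all N! arrangements of Y.

```python
from itertools import permutations


def sum_sq_diff(x, y):
    """Sum of squared differences between paired ranks."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def max_sum_sq_diff(n):
    """Maximum of sum d^2 for n untied ranks, formula [21.3]."""
    return n * (n * n - 1) / 3


def mean_sum_sq_diff(n):
    """Average of sum d^2 over all n! arrangements of Y against X = 1..n."""
    x = list(range(1, n + 1))
    values = [sum_sq_diff(x, y) for y in permutations(x)]
    return sum(values) / len(values)


x = [1, 2, 3, 4, 5]
y = [1, 4, 3, 5, 2]
print(sum_sq_diff(x, y))                             # example in the text
print(sum_sq_diff(x, x[::-1]), max_sum_sq_diff(5))   # inverse order vs. [21.3]
print(mean_sum_sq_diff(5), max_sum_sq_diff(5) / 2)   # random order vs. [21.4]
```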


Another measure of disarray is the statistic S. Consider again the paired
ranks, on X 1, 2, 3, 4, 5 and on Y 1, 4, 3, 5, 2. The X ranks are in their
natural order and the Y ranks exhibit a degree of disarray with respect to
X. To calculate S we compare each rank on Y with every other rank, there
being N(N - 1)/2 such comparisons for N ranks. If a pair is ranked in its
natural order, say 1 and 4, a weight +1 is assigned. If a pair is ranked in an
inverse order, say 4 and 3, a weight -1 is assigned. The statistic S is the
sum of such weights over the N(N - 1)/2 comparisons. In the example
above the weights are +1, +1, +1, +1, -1, +1, -1, +1, -1, -1, and
S = 2. The maximum and minimum values of S may be considered. The
maximum value occurs when both sets of ranks are in a natural order, all
weights being +1, and is N(N - 1)/2. The minimum value of S occurs
when the two sets of ranks are in an inverse order, all weights being -1,
and is -N(N - 1)/2. When ranks are arranged at random with respect to each
other, the expected value of S is zero.

Both $\sum d^2$ and S are used in the definition of measures of rank
correlation and for other purposes as well.
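The pairwise weighting that defines S may be sketched as follows. This is an illustrative Python sketch, not part of the original text; the names `sign` and `disarray_s` are assumptions. Each weight is computed as the product of the signs of the X and Y differences, which reduces to the +1 and -1 weights described above when the X ranks are in their natural order.

```python
def sign(v):
    """Return +1, 0, or -1 according to the sign of v."""
    return (v > 0) - (v < 0)


def disarray_s(x, y):
    """Sum of the +1 / -1 weights over all N(N - 1)/2 pairs of individuals."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            s += sign(x[j] - x[i]) * sign(y[j] - y[i])
    return s


x = [1, 2, 3, 4, 5]
print(disarray_s(x, [1, 4, 3, 5, 2]))   # example in the text
print(disarray_s(x, [1, 2, 3, 4, 5]))   # same order: maximum N(N - 1)/2
print(disarray_s(x, [5, 4, 3, 2, 1]))   # inverse order: minimum -N(N - 1)/2
```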


21.4  SPEARMAN'S COEFFICIENT OF RANK CORRELATION ρ

The measure of disarray, $\sum d^2$, is used in the definition of Spearman's
coefficient of rank correlation. A coefficient of rank correlation is a
statistic defined in such a way as to take a value of +1 when the paired
ranks are in the same order, a value of -1 when the ranks are in an inverse
order, and an expected value of zero when the ranks are arranged at random
with respect to each other. A definition of rank correlation which meets
these requirements is:

[21.5]    $\rho = 1 - \frac{2 \sum d^2}{\sum d^2_{max}}$

where ρ is the Greek letter rho. When the paired ranks are in the same
order, $\sum d^2 = 0$ and ρ = 1. When the paired ranks are in an inverse
order, $\sum d^2 = \sum d^2_{max}$, and ρ = -1. In the case of independence,
$2 \sum d^2 = \sum d^2_{max}$ and ρ = 0. By substituting
$\sum d^2_{max} = N(N^2 - 1)/3$ in the above formula we obtain

[21.6]    $\rho = 1 - \frac{6 \sum d^2}{N(N^2 - 1)}$

This is the usual way of writing Spearman's coefficient of rank correlation.

Pearson's product-moment correlation coefficient may be written in the
form:

[21.7]    $r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}}$

Spearman's ρ is a particular case of the above formula. It is the particular
case that arises when the variables are the first N consecutive untied
integers. If the above formula is applied directly to paired ranks, the
result is identical with that obtained by applying the formula for ρ.

The calculation of ρ is illustrated in Table 21.1. The calculation is
simple. We find the differences between the paired ranks, square them,
sum to obtain $\sum d^2$, and then apply the formula for ρ.

Table 21.1  Calculation of Spearman's coefficient of rank correlation

                      Rank
    Individual      X      Y      Difference d      d^2
    A_1             1      6          -5             25
    A_2             2      3          -1              1
    A_3             3      7          -4             16
    A_4             4      2           2              4
    A_5             5      1           4             16
    A_6             6      8          -2              4
    A_7             7      4           3              9
    A_8             8      9          -1              1
    A_9             9      5           4             16
    A_10           10     10           0              0
    Total                              0      $\sum d^2 = 92$

    $\rho = 1 - \frac{6 \times 92}{10(10^2 - 1)} = .442$
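The calculation of Table 21.1 can be sketched in a few lines. This is an illustrative Python sketch, not part of the original text; the function names `spearman_rho` and `pearson_r` are assumptions. It also applies the product-moment formula directly to the ranks, which should give the same value when no ties are present.

```python
import math


def spearman_rho(x, y):
    """Spearman's rho from formula [21.6], for untied ranks."""
    n = len(x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


def pearson_r(x, y):
    """Product-moment correlation, formula [21.7]."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Ranks of Table 21.1
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [6, 3, 7, 2, 1, 8, 4, 9, 5, 10]
print(round(spearman_rho(x, y), 3))
print(round(pearson_r(x, y), 3))
```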


21.5  SPEARMAN'S ρ WITH TIED RANKS

In arranging the members of a group in order, a judge may be unable to
discriminate between certain members. Where measurements are replaced
by ranks, certain measurements may be equal. These circumstances give
rise to tied ranks. If we attempt to replace the numbers 14, 19, 19, 22, 23,
23, 23, 25 by ranks, we observe immediately that 19 occurs twice and 23
three times. Under these circumstances we assign to each member the
average rank which the tied observations occupy. Thus 14 is ranked 1, the
two 19's are ranked 2.5 and 2.5, the 22 is ranked 4, the three 23's are
ranked 6, 6, and 6, and 25 is ranked 8. Having replaced the tied ranks by
their average rank, we proceed as before in the calculation of ρ. A
calculation with tied ranks is illustrated in Table 21.2. If the ties are
numerous, this type of adjustment for tied ranks may not prove altogether
satisfactory.

Table 21.2  Calculation of Spearman's coefficient of rank correlation with
tied ranks

                      Rank
    Individual      X      Y      Difference d      d^2
    A_1             1      8          -7            49.00
    A_2             2.5    6.5        -4            16.00
    A_3             2.5    4.5        -2             4.00
    A_4             4.5    2           2.5           6.25
    A_5             4.5    1           3.5          12.25
    A_6             6      3           3             9.00
    A_7             8      4.5         3.5          12.25
    A_8             8      6.5         1.5           2.25
    A_9             8      9          -1             1.00
    A_10           10     10           0              .00
    Total                                  $\sum d^2 = 112.00$

    $\rho = 1 - \frac{6 \times 112}{10(10^2 - 1)} = .321$

The development of ρ from the ordinary product-moment r assumes that
the ranks are the first N integers. Where tied ranks occur this is not so.
Where a substantial number of tied ranks is found, the departure of the
sum of squares of ranks from the sum of squares of the first N integers will
be appreciable and the value of ρ will be thereby affected. While other
procedures for correcting for ties may be used, one convenient approach is
to calculate an ordinary product-moment correlation for the paired
observations where average ranks have been substituted for ties.
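The average-rank convention, and the alternative of applying the product-moment formula to the average ranks, may be sketched as follows. This is an illustrative Python sketch, not part of the original text; the function names are assumptions. With the ranks of Table 21.2 the product-moment value comes out somewhat lower than the value from the $\sum d^2$ formula, which illustrates the effect of ties.

```python
import math


def average_ranks(values):
    """Replace measurements by ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j to the end of the run of values equal to values[order[i]]
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Product-moment correlation applied to the paired (average) ranks."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


print(average_ranks([14, 19, 19, 22, 23, 23, 23, 25]))

# Average ranks of Table 21.2
x = [1, 2.5, 2.5, 4.5, 4.5, 6, 8, 8, 8, 10]
y = [8, 6.5, 4.5, 2, 1, 3, 4.5, 6.5, 9, 10]
sum_d2 = sum((a - b) ** 2 for a, b in zip(x, y))
rho_formula = 1 - 6 * sum_d2 / (10 * (10 ** 2 - 1))
print(sum_d2, round(rho_formula, 3), round(pearson_r(x, y), 3))
```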


21.6  TESTING THE SIGNIFICANCE OF SPEARMAN'S ρ

The problem of developing a test of significance for ρ is approached by
considering the N factorial possible arrangements of Y with respect to a
fixed ranking on X. Each arrangement is considered equally probable. For
each arrangement either the quantity $\sum d^2$ or ρ may be calculated. A
frequency distribution can be made either of the values of $\sum d^2$ or of
the values of ρ. This is a null distribution and serves as a model for
evaluating a particular observed value of $\sum d^2$, or ρ, against the null
hypothesis, $H_0$, that no association between X and Y exists. If the
particular observed arrangement of Y with respect to X, as reflected in a low
$\sum d^2$, or a high ρ, is improbable, say p ≤ .05 or .01, then the null
hypothesis is rejected. To illustrate, for N = 2, if X has the ranks 1, 2,
only two arrangements of Y with respect to X are possible: 1, 2 and 2, 1.
Only two values of ρ are possible, +1 and -1. For N = 3, if X has the ranks
1, 2, 3, there are six possible arrangements of Y and, as it turns out, four
different values of ρ, -1, -1/2, +1/2, and +1. The sampling distribution of
$\sum d^2$ has been studied by Kendall (1943) and others. For N = 7 or 8 the
distribution has a somewhat jagged or serrated appearance. The distribution
is always symmetrical. As N increases in size, the distribution approaches
the normal form.

Table G of the Appendix shows critical values of ρ for different values of
N required for significance at various levels. Observe that for a small N,
values of ρ of very substantial size must be obtained before we have
adequate grounds for rejecting the hypothesis that no association exists
between the rankings. For N = 10 we require a ρ equal to or greater than
.564 before we can argue that a significant association exists in a positive
direction at the 5 per cent level.

With N = 10 or greater, we may test the significance of ρ by using a t
given by

[21.8]    $t = \rho \sqrt{\frac{N - 2}{1 - \rho^2}}$

This quantity has a t distribution with N - 2 degrees of freedom. For
example, where N = 10 and ρ = .564, t = 1.93. For eight degrees of
freedom, the value of t at the .05 level is 2.31. For a two-tailed test we
have insufficient grounds for arguing that the observed ρ is significantly
different from zero. For a one-tailed test the observed ρ is significant at
about the 5 per cent level.
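Both the exact null distribution for a very small N and the t approximation can be sketched briefly. This is an illustrative Python sketch, not part of the original text; the function names are assumptions. The enumeration treats every one of the N! arrangements of Y as equally probable, as described above.

```python
import math
from collections import Counter
from itertools import permutations


def null_distribution_rho(n):
    """Frequency of each value of rho over the n! arrangements of Y."""
    x = list(range(1, n + 1))
    counts = Counter()
    for y in permutations(x):
        sum_d2 = sum((a - b) ** 2 for a, b in zip(x, y))
        counts[1 - 6 * sum_d2 / (n * (n * n - 1))] += 1
    return dict(sorted(counts.items()))


def t_for_rho(rho, n):
    """t statistic of formula [21.8], with n - 2 degrees of freedom."""
    return rho * math.sqrt((n - 2) / (1 - rho ** 2))


print(null_distribution_rho(3))          # four distinct values of rho for N = 3
print(round(t_for_rho(0.564, 10), 2))    # worked example in the text
```

For larger N the enumeration grows as N factorial, which is why tabled critical values or the t approximation are used in practice.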


21.7  KENDALL'S COEFFICIENT OF RANK CORRELATION τ

An alternative form of rank correlation, τ, or tau, has been developed by
Kendall (1952, 1955). In the definition of Kendall's tau use is made of the
measure of disarray, S. The maximum possible value of S is N(N - 1)/2.
Kendall's coefficient of rank correlation, τ, is defined as the obtained value
of S divided by its maximum possible value, that is,

[21.9]    $\tau = \frac{S}{\frac{1}{2} N(N - 1)}$

The statistic τ has a value of -1 when the paired ranks are in an inverse
order, and a value of +1 when the paired ranks are in the same order. For X
of 1, 2, 3, 4, and 5 and Y of 1, 4, 3, 5, 2, S = 2, N = 5, and τ = 2/10 = .20.


21.8  KENDALL'S τ WITH TIED RANKS

If ties occur, the convention is adopted, as in the calculation of Spearman's
ρ, of replacing the tied values by the average rank. A comparison of two
tied values on Y receives a weight of zero. If ties occur on X, a comparison
of the corresponding paired Y values will also receive a weight of zero
regardless of whether the paired Y values are tied. Consider the following
example with no ties on X and one tied pair on Y:

    X    1    2    3      4      5    6
    Y    2    3    4.5    4.5    1    6

On comparing each rank on Y with every other rank, and assigning a +1 for
a pair in their natural order, a -1 for a pair in an inverse order, and a 0
for a tie, we obtain +1, +1, +1, -1, +1, +1, +1, -1, +1, 0, -1, +1, -1,
+1, +1, and S = 6. Consider another example with tied values on both X
and Y.

    X    1.5    1.5    3      5      5    5
    Y    2      3      4.5    4.5    1    6

Here the comparison on Y of 2 with 3 receives a weight of zero, because the
order of the first two paired values on X is arbitrary. Similarly, comparisons
on Y involving the last three values will receive weights of zero, because of
the triplet of ties on X. In the above example the weights are 0, +1, +1, -1,
+1, +1, +1, -1, +1, 0, -1, +1, 0, 0, 0, and S = 4.

To calculate tau with tied ranks, S is calculated in the manner described
above, and the following formula applied:

[21.10]    $\tau = \frac{S}{\sqrt{\left[\frac{1}{2} N(N - 1) - T_x\right]\left[\frac{1}{2} N(N - 1) - U_y\right]}}$

In this formula $T_x = \frac{1}{2} \sum t(t - 1)$ and
$U_y = \frac{1}{2} \sum u(u - 1)$. One ranking contains m sets of t ties; the
other ranking, r sets of u ties. To illustrate, consider the example
immediately above with ties on both X and Y. Here
$T_x = \frac{1}{2}[2(2 - 1) + 3(3 - 1)] = 4$; also
$U_y = \frac{1}{2}[2(2 - 1)] = 1$. In this example N = 6, and tau is as
follows:

    $\tau = \frac{4}{\sqrt{\left[\frac{1}{2} \times 6(6 - 1) - 4\right]\left[\frac{1}{2} \times 6(6 - 1) - 1\right]}} = .32$
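The calculation of S and tau with tied ranks can be sketched as follows. This is an illustrative Python sketch, not part of the original text; the function names are assumptions. A pair tied on either X or Y receives a weight of zero because the product of the two signs is then zero.

```python
import math
from collections import Counter


def sign(v):
    """Return +1, 0, or -1 according to the sign of v."""
    return (v > 0) - (v < 0)


def kendall_s(x, y):
    """S with zero weight for any pair tied on X or on Y."""
    n = len(x)
    return sum(sign(x[j] - x[i]) * sign(y[j] - y[i])
               for i in range(n) for j in range(i + 1, n))


def half_tie_sum(ranks):
    """One-half the sum of t(t - 1) over the groups of tied ranks."""
    return sum(t * (t - 1) for t in Counter(ranks).values()) / 2


def kendall_tau_tied(x, y):
    """Tau from formula [21.10]."""
    n = len(x)
    pairs = n * (n - 1) / 2
    s = kendall_s(x, y)
    return s / math.sqrt((pairs - half_tie_sum(x)) * (pairs - half_tie_sum(y)))


y = [2, 3, 4.5, 4.5, 1, 6]
x_untied = [1, 2, 3, 4, 5, 6]
x_tied = [1.5, 1.5, 3, 5, 5, 5]
print(kendall_s(x_untied, y), round(kendall_tau_tied(x_untied, y), 3))
print(kendall_s(x_tied, y), round(kendall_tau_tied(x_tied, y), 3))
```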


21.9  THE SIGNIFICANCE OF S AND τ

The sampling distribution of S is obtained by considering the N factorial
arrangements of Y in relation to X. A value of S may be determined for each
of the N factorial arrangements. The distribution of these N factorial values
is the sampling distribution of S. This distribution is symmetrical.
Frequencies taper off systematically from the maximum value toward the tails.
The distribution of S rapidly approaches the normal form. For N = 10 the
normal approximation to the exact distribution is very close. The exact
sampling distributions of S for N = 4 to N = 10 are given by Kendall (1955).

In testing the significance of the association between paired ranks, it is
more convenient to apply a test directly to S rather than to τ. The variance
of the sampling distribution of S without ties is given by

[21.11]    $\sigma_S^2 = \frac{N(N - 1)(2N + 5)}{18}$

If the normal approximation to the exact sampling distribution of S is used,
a correction for continuity should be applied. This is done by subtracting
unity from the absolute value of S. To apply a significance test, we divide S,
corrected for continuity, by the standard deviation of the sampling
distribution to obtain the normal deviate z, as follows:

[21.12]    $z = \frac{|S| - 1}{\sqrt{N(N - 1)(2N + 5)/18}}$

As usual, 1.96 and 2.58 are required for significance at the .05 and .01
levels, respectively, for a nondirectional test. To illustrate, consider the
paired ranks:

    X    1    2    3    4    5    6
    Y    2    4    3    5    1    6

Here the weights are +1, +1, +1, -1, +1, -1, +1, -1, +1, +1, -1, +1,
-1, +1, +1, and S = 5. Hence

    $z = \frac{5 - 1}{\sqrt{6(6 - 1)(2 \times 6 + 5)/18}} = \frac{4}{5.323} = .751$

Here the association between the paired ranks is clearly not significant.
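The normal-deviate test for S in the untied case can be sketched as follows. This is an illustrative Python sketch, not part of the original text; the function names are assumptions, and the critical values 1.96 and 2.58 are those quoted in the text.

```python
import math


def sign(v):
    """Return +1, 0, or -1 according to the sign of v."""
    return (v > 0) - (v < 0)


def kendall_s(x, y):
    """Sum of pairwise weights, as described in Sec. 21.3."""
    n = len(x)
    return sum(sign(x[j] - x[i]) * sign(y[j] - y[i])
               for i in range(n) for j in range(i + 1, n))


def sd_of_s(n):
    """Standard deviation of S without ties, square root of formula [21.11]."""
    return math.sqrt(n * (n - 1) * (2 * n + 5) / 18)


def z_for_s(s, n):
    """Normal deviate with continuity correction, formula [21.12]."""
    return (abs(s) - 1) / sd_of_s(n)


x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 3, 5, 1, 6]
s = kendall_s(x, y)
z = z_for_s(s, len(x))
print(s, round(sd_of_s(len(x)), 3), round(z, 3))
print(z >= 1.96)   # significance at the .05 level, nondirectional test
```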
Because with problems involving ranks ties are very common, it is useful
to know the variance of the sampling distribution of S when ties occur. If
ties occur in one set of ranks, and not in the other, the variance of S
becomes

[21.13]    $\sigma_S^2 = \frac{1}{18}\left[N(N - 1)(2N + 5) - \sum^{m} t(t - 1)(2t + 5)\right]$

One set of ranks contains m sets of t ties. Note that the effect of ties is to
reduce the variance $\sigma_S^2$. Examination of the above formula shows that
the effect of ties is to reduce the variance by unity for each tied pair, by
3.67 for each triplet of ties, and by 8.67 for each quadruplet of ties. Thus
we have available a very convenient correction procedure.
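The reductions quoted above follow directly from the tie term of formula [21.13]. The following is an illustrative Python sketch, not part of the original text; the function names are assumptions.

```python
def tie_reduction(t):
    """Reduction in the variance of S contributed by one group of t tied ranks."""
    return t * (t - 1) * (2 * t + 5) / 18


def variance_of_s(n, tie_sizes=()):
    """Variance of S when ties occur in one set of ranks only, formula [21.13]."""
    return n * (n - 1) * (2 * n + 5) / 18 - sum(tie_reduction(t) for t in tie_sizes)


for t in (2, 3, 4):
    print(t, round(tie_reduction(t), 2))   # pair, triplet, quadruplet

print(variance_of_s(6), variance_of_s(6, [2, 3]))
```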

If ties occur in both sets of ranks, the variance of S is given by

[21.14]    $\sigma_S^2 = \frac{1}{18}\left[N(N - 1)(2N + 5) - \sum^{m} t(t - 1)(2t + 5) - \sum^{r} u(u - 1)(2u + 5)\right]$
           $\quad + \frac{1}{9N(N - 1)(N - 2)}\left[\sum^{m} t(t - 1)(t - 2)\right]\left[\sum^{r} u(u - 1)(u - 2)\right]$
           $\quad + \frac{1}{2N(N - 1)}\left[\sum^{m} t(t - 1)\right]\left[\sum^{r} u(u - 1)\right]$

In this formula one set of ranks contains m sets of t ties; the other set of
ranks, r sets of u ties. This formula is the more general form for the
variance of the distribution of S. As in the untied case, to apply a test of
significance, we divide S, corrected for continuity, that is, |S| - 1, by the
standard deviation of the sampling distribution $\sigma_S$ to obtain the
normal deviate z.


21.10  RANK CORRELATION WHEN ONE
VARIABLE IS A DICHOTOMY

The statistic S, and the rank correlation coefficient tau, may be calculated
when one set of ranks is a dichotomy. Consider an example given by
Kendall (1955) for 15 boys and girls ranked on an examination. Here sex is a
dichotomous variable. Let X be ranks on the examination, and Y sex.

    X    1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
    Y    B   B   G   B   G   G   B   B   B   G   B   G   B   G   G

Here the eight boys may be considered tied on Y; also the seven girls.
Adopting the average-rank procedure, a rank of 4.5 may be assigned to
each of the eight boys and a rank of 12 to each of the seven girls. The rank
4.5 is the average of the ranks 1 to 8, and the rank 12 is the average of the
ranks 9 to 15. Thus we may write

    X    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15
    Y  4.5  4.5   12  4.5   12   12  4.5  4.5  4.5   12  4.5   12  4.5   12   12

The statistic S calculated from these ranks is

    S = 7 + 7 - 6 + 6 - 5 - 5 + 4 + 4 + 4 - 2 + 3 - 1 + 2 + 0 = 18

The coefficient tau is obtained using formula [21.10], and is

    $\tau = \frac{18}{\sqrt{\left[\frac{1}{2} \times 15(15 - 1)\right]\left[\frac{1}{2} \times 15(15 - 1) - 49\right]}} = .235$

where $T_x = 0$ and $U_y = \frac{1}{2}(8 \times 7) + \frac{1}{2}(7 \times 6)$.
In this example the rank 4.5 has been assigned to boys and the rank 12 to
girls, resulting in a positive correlation. This, of course, is arbitrary. The
ranks could have been reversed and a negative correlation obtained.

To test the significance of the association, the standard error $\sigma_S$ is
calculated and a normal deviate z obtained. In the present illustrative
example X contains no ties, Y contains two groups of ties, and $\sigma_S$ may
be obtained using the square root of formula [21.13] as follows:

    $\sigma_S = \sqrt{\frac{1}{18}(15 \times 14 \times 35 - 8 \times 7 \times 21 - 7 \times 6 \times 19)} = 17.28$

Subtracting unity from the absolute value of S as a continuity correction
and dividing by $\sigma_S$, we obtain the normal deviate

    $z = (|S| - 1)/\sigma_S = 17/17.28 = .984$

Clearly the association between ranks on the examination and sex is not
significant.
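The dichotomy example can be worked through step by step. This is an illustrative Python sketch, not part of the original text; the function names and the string encoding of the boy and girl sequence are assumptions introduced for illustration.

```python
import math
from collections import Counter


def sign(v):
    """Return +1, 0, or -1 according to the sign of v."""
    return (v > 0) - (v < 0)


def kendall_s(x, y):
    """S with zero weight for any pair tied on X or on Y."""
    n = len(x)
    return sum(sign(x[j] - x[i]) * sign(y[j] - y[i])
               for i in range(n) for j in range(i + 1, n))


def half_tie_sum(ranks):
    """One-half the sum of t(t - 1) over the groups of tied ranks."""
    return sum(t * (t - 1) for t in Counter(ranks).values()) / 2


sexes = "BBGBGGBBBGBGBGG"                      # in order of examination rank
x = list(range(1, len(sexes) + 1))
y = [4.5 if s == "B" else 12.0 for s in sexes]  # average ranks for boys, girls
n = len(x)

s = kendall_s(x, y)
pairs = n * (n - 1) / 2
tau = s / math.sqrt((pairs - half_tie_sum(x)) * (pairs - half_tie_sum(y)))

# Ties occur in Y only, so formula [21.13] applies.
tie_sizes = list(Counter(y).values())
sigma_s = math.sqrt((n * (n - 1) * (2 * n + 5)
                     - sum(t * (t - 1) * (2 * t + 5) for t in tie_sizes)) / 18)
z = (abs(s) - 1) / sigma_s
print(s, round(tau, 3), round(sigma_s, 2), round(z, 3))
```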

When one variable is a dichotomy and no ties occur in the X ranks, the
absolute value of S is reduced by unity as a continuity correction. When one
variable is a dichotomy and the other contains m groupings of values of
extent t, the absolute value of S is reduced by
$(2N - t_1 - t_m)/2(m - 1)$, where t > 1. Here $t_1$ and $t_m$ are the number
of tied values in the first and last groups. The reader should note also that
when ties occur in the X variable, formula [21.14] and not formula [21.13]
should be used in calculating $\sigma_S$.


21.11  COMPARISON OF ρ AND τ

The coefficients ρ and τ, although used for the same purpose, are on a
somewhat different scale. If ρ and τ are calculated on the same data, the
absolute value of ρ will usually be larger than the corresponding value of τ.
Values of ρ and τ are highly correlated in samples from a bivariate normal
population. When the correlation in the population is zero, the
product-moment correlation between ρ and τ is .980 for N = 5 and approaches
1 as N approaches infinity. For practical purposes the correlation between
the two statistics can be regarded as close to 1.

On considering the N factorial arrangements of one set of ranks in
relation to another, the distributions of both ρ and τ are symmetrical and
tend to the normal form for large N. The distribution of τ tends to approach
the normal form more rapidly than that of ρ. The exact distributions are
known for higher values of N for τ than for ρ. In general τ as a statistic is
more amenable to mathematical manipulation than ρ. Problems resulting from
tied values are more readily solved. Also, the measure of disarray, S, used
in the definition of τ, seems to have a degree of generality about it that
does not characterize $\sum d^2$. S has a number of applications apart from
its use in correlation.


21.12  THE COEFFICIENT OF CONCORDANCE W

For data comprised of m sets of ranks, where m > 2, a descriptive measure
of the agreement or concordance between the m sets is provided by
Kendall's coefficient of concordance W. The data of Table 21.3 consist of six
ranks assigned by four judges. These data were obtained in an investigation
on interviewing technique. Four interviewers were required to interview six
job applicants and rank order them on suitability for employment.

Table 21.3  Ranks assigned to six job applicants by four interviewers

                            Applicant
    Interviewer      a     b     c     d     e     f
    A                6     4     1     2     3     5
    B                5     3     1     2     4     6
    C                6     4     2     1     3     5
    D                3     1     4     5     2     6
    $R_j$           20    12     8    10    12    22

If perfect agreement were observed between the four interviewers, one
applicant would be assigned a 1 by all four. The sum of his ranks would be
4. Another applicant would be assigned a 2 by all four interviewers. The
sum of his ranks would be 8. The sum of ranks for the six applicants would
be 4, 8, 12, 16, 20, and 24, not necessarily in that order. In general, when
perfect agreement exists among ranks assigned by m judges to N members,
the rank sums are m, 2m, 3m, 4m, . . . , Nm. The total sum of N ranks for
m judges is mN(N + 1)/2, and the mean rank sum is m(N + 1)/2.

The degree of agreement between judges reflects itself in the variation in
the rank sums. When all judges agree, this variation is a maximum.
Disagreement between judges reflects itself in a reduction in the variation of
rank sums. For maximum disagreement the rank sums will tend to be more
or less equal. This circumstance provides the basis for the definition of a
coefficient of concordance.

Let $R_j$ represent the rank sum of the jth individual. The sum of squares
of rank sums for N individuals is

[21.15]    $S = \sum_{j=1}^{N} \left(R_j - \frac{m(N + 1)}{2}\right)^2$

The maximum value of this sum of squares occurs when perfect agreement
exists between judges and is equal to $m^2(N^3 - N)/12$. The coefficient of
concordance W is defined as the ratio of S to the maximum possible value
of S and is

[21.16]    $W = \frac{12S}{m^2(N^3 - N)}$

When perfect agreement exists between judges, W = 1. When maximum
disagreement exists, W = 0. W does not take negative values. With more
than two judges complete disagreement cannot occur. For example, if A
and B are in complete disagreement and A and C are also in complete
disagreement, then B and C must be in complete agreement.

In the example of Table 21.3 the rank totals are 20, 12, 8, 10, 12, and 22.
The sum of ranks is 84. The mean rank total, the rank sum expected in the
case of independence, is 84/6 = 14. The sum of squares of deviations about
this mean is

    $S = (20 - 14)^2 + (12 - 14)^2 + (8 - 14)^2 + (10 - 14)^2 + (12 - 14)^2 + (22 - 14)^2 = 160$

In our example m = 4 and N = 6 and the coefficient of concordance is

    $W = \frac{12 \times 160}{4^2(6^3 - 6)} = .571$
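The calculation of W for Table 21.3 can be sketched as follows. This is an illustrative Python sketch, not part of the original text; the function name `concordance_w` is an assumption. Each inner list holds the ranks assigned by one interviewer to the six applicants.

```python
def concordance_w(rankings):
    """Rank sums, S of formula [21.15], and W of formula [21.16]."""
    m = len(rankings)                 # number of judges
    n = len(rankings[0])              # number of individuals ranked
    rank_sums = [sum(column) for column in zip(*rankings)]
    mean_rank_sum = m * (n + 1) / 2
    s = sum((r - mean_rank_sum) ** 2 for r in rank_sums)
    w = 12 * s / (m ** 2 * (n ** 3 - n))
    return rank_sums, s, w


# Table 21.3: rows are interviewers A to D, columns are the six applicants
table_21_3 = [
    [6, 4, 1, 2, 3, 5],
    [5, 3, 1, 2, 4, 6],
    [6, 4, 2, 1, 3, 5],
    [3, 1, 4, 5, 2, 6],
]
rank_sums, s, w = concordance_w(table_21_3)
print(rank_sums, s, round(w, 3))
```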


The concordance among m sets of ranks may be described by calculating
Spearman rank-order correlation coefficients between all possible pairs of
ranks and finding the average value, denoted by $\bar{\rho}$. This average is
related to W. The relation is given by

[21.17]    $\bar{\rho} = \frac{mW - 1}{m - 1}$

For the particular case where m = 2 the relation is $\bar{\rho} = 2W - 1$. For
W = 0, $\bar{\rho} = -1$, for W = .5, $\bar{\rho} = 0$, and for W = 1,
$\bar{\rho} = 1$.
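The relation between the average Spearman coefficient and W can be checked numerically on the data of Table 21.3. This is an illustrative Python sketch, not part of the original text; the function names are assumptions, and the relation tested is the standard one, average rho = (mW - 1)/(m - 1), which reduces to 2W - 1 for m = 2.

```python
from itertools import combinations


def spearman_rho(x, y):
    """Spearman's rho for untied ranks."""
    n = len(x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


def concordance_w(rankings):
    """Kendall's coefficient of concordance for untied ranks."""
    m, n = len(rankings), len(rankings[0])
    rank_sums = [sum(column) for column in zip(*rankings)]
    s = sum((r - m * (n + 1) / 2) ** 2 for r in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


table_21_3 = [
    [6, 4, 1, 2, 3, 5],
    [5, 3, 1, 2, 4, 6],
    [6, 4, 2, 1, 3, 5],
    [3, 1, 4, 5, 2, 6],
]
m = len(table_21_3)
pair_rhos = [spearman_rho(a, b) for a, b in combinations(table_21_3, 2)]
mean_rho = sum(pair_rhos) / len(pair_rhos)
predicted = (m * concordance_w(table_21_3) - 1) / (m - 1)
print(round(mean_rho, 4), round(predicted, 4))
```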


21.13  THE COEFFICIENT OF CONCORDANCE WITH TIED RANKS

Where tied ranks occur, proceed as before and assign to each member the
average rank which the tied observations occupy. If the ties are not
numerous, we may compute W directly from the data without further
adjustment. If the ties are numerous, a correction factor is calculated for
each set of ranks. This correction factor is

[21.18]    $T = \frac{\sum (t^3 - t)}{12}$

where t is the number of ranks in a group of ties. For example, if the ranks
on X are 1, 2.5, 2.5, 4, 5, 6, 8, 8, 8, 10, we have two groups of ties, one of
two ranks and one of three ranks. The correction factor for this set of ranks
for X is

    $T = \frac{(2^3 - 2) + (3^3 - 3)}{12} = 2.5$

A correction factor T is calculated for each of the m sets of ranks, and these
are added together over the m sets to obtain $\sum T$. We then apply a formula
for W in which this correction factor is incorporated. The formula is

[21.19]    $W = \frac{S}{\frac{1}{12} m^2(N^3 - N) - m \sum T}$

The application of this correction tends to increase the size of W. The
correction has a small effect unless ties are quite numerous.
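The tie correction can be sketched as follows. This is an illustrative Python sketch, not part of the original text; the function names are assumptions. Untied ranks form groups of size one and so contribute nothing to T, which means the corrected formula gives the uncorrected W when no ties are present.

```python
from collections import Counter


def tie_correction(ranks):
    """Correction factor T of formula [21.18] for one set of ranks."""
    return sum(t ** 3 - t for t in Counter(ranks).values()) / 12


def concordance_w_tied(rankings):
    """W with the tie correction incorporated, formula [21.19]."""
    m, n = len(rankings), len(rankings[0])
    rank_sums = [sum(column) for column in zip(*rankings)]
    s = sum((r - m * (n + 1) / 2) ** 2 for r in rank_sums)
    total_t = sum(tie_correction(r) for r in rankings)
    return s / (m ** 2 * (n ** 3 - n) / 12 - m * total_t)


print(tie_correction([1, 2.5, 2.5, 4, 5, 6, 8, 8, 8, 10]))   # example in the text

# With no ties the corrected formula reduces to the uncorrected W.
table_21_3 = [
    [6, 4, 1, 2, 3, 5],
    [5, 3, 1, 2, 4, 6],
    [6, 4, 2, 1, 3, 5],
    [3, 1, 4, 5, 2, 6],
]
print(round(concordance_w_tied(table_21_3), 3))
```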


21.14  THE SIGNIFICANCE OF THE COEFFICIENT
OF CONCORDANCE W

For N of 7 or less, values of W required for significance at certain levels
have been tabulated; see Kendall (1955) and Siegel (1956). A useful
adaptation of this table is given by Edwards (1967). Critical values of W
depend both on m, the number of sets of ranks, and on N, the number of ranks
in each set. For N greater than 7, a $\chi^2$ test may be applied. Calculate
the quantity

[21.20]    $\chi^2 = m(N - 1)W$

This has a chi-square distribution with N - 1 degrees of freedom. For the
data of Table 21.3, S = 160, W = .57, m = 4, and N = 6. Edwards' table
provides critical values of .505 and .621 for significance at the 5 and 1 per
cent levels. If we apply the chi-square test to the same data we obtain

    $\chi^2 = 4(6 - 1)(.571) = 11.42$

For df = 6 - 1 = 5 the values of $\chi^2$ required for significance are 11.07
and 15.09 at the 5 and 1 per cent levels, and as before we are led to the
conclusion of significant association at the 5 per cent level. Of course in
this case the tabled values are to be preferred because N is less than 7. For
N less than 7 the chi-square test will provide a very rough estimate of the
required probabilities. Other procedures for testing the significance of W
exist. For a more thorough discussion of this problem see Edwards (1967).
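The chi-square test for W can be sketched as follows. This is an illustrative Python sketch, not part of the original text; the function name is an assumption, and the critical values are simply those quoted in the text for 5 degrees of freedom. Using the unrounded W gives 11.43 rather than the 11.42 obtained from W = .571.

```python
def concordance_chi_square(w, m, n):
    """Chi-square of formula [21.20] and its degrees of freedom."""
    return m * (n - 1) * w, n - 1


m, n, s = 4, 6, 160                       # Table 21.3
w = 12 * s / (m ** 2 * (n ** 3 - n))
chi2, df = concordance_chi_square(w, m, n)

critical_05, critical_01 = 11.07, 15.09   # values quoted in the text for df = 5
print(round(chi2, 2), df)
print(chi2 >= critical_05, chi2 >= critical_01)
```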
21.15  THE COEFFICIENT OF CONSISTENCE K

To obtain a ranking of objects on an attribute, the objects may be
presented two at a time in all possible pairs and a judge required to make a
choice on the presentation of each pair. Thus a choice is made between
every object and every other object. This procedure is known as the
method of paired comparisons and has been widely used in psychological
work. The method is usually assumed to yield a more reliable ordering than
that obtained by requiring a judge to order a whole group of objects
directly. The number of possible pairs is the number of combinations of N
things taken two at a time, or N(N - 1)/2. As N increases, the number of
comparisons increases very rapidly; consequently for large N the method is
frequently impractical.

In the method of paired comparisons we may wish to ascertain the
consistency of the choices made. Let A, B, and C be three objects. If A is
preferred to B and B is preferred to C, consistency of judgment would
require that A be preferred to C. If C is preferred to A, this latter choice is
clearly inconsistent with the two previous choices. What meaning attaches
to the presence of inconsistent choices? Let A, B, and C be red, blue, and
yellow cards, each of a different saturation. A judge may prefer red to blue,
blue to yellow, and then may indicate a preference of yellow to red. This
inconsistent choice may result because the judge may be unable to
discriminate and may indicate preferences in a more or less haphazard
fashion. Many inconsistent choices in the method of paired comparisons
result because the task requires a refinement of discrimination which is
beyond the capacity of the judge. Inconsistent responses may also arise
because the dimension of judgment has changed. The red card may be
preferred to the blue and the blue to the yellow on the basis of hue. The
yellow may be preferred to the red on the basis of saturation. A different
dimension is used as a basis of choice and leads to the presence of an
inconsistency. To illustrate further, an orange may be preferred to a peach
because of its color, a peach may be preferred to a pear because of its
flavor, a pear may be preferred to an orange because of its shape, and thus
an inconsistency arises. Where inconsistencies are numerous, a question
may attach to the meaning of the rank ordering of objects obtained. It is


is an inconsistent triplet, or triad, of choices. For any set of paired
comparisons between N objects the number of inconsistent triads may be
counted and used to define a coefficient of consistency of response.


Table 21.4

           A   B   C   D   E   F   G   H   I      R     $(R - \bar{R})^2$
    A      -   1   0   0   1   1   1   0   1      5         1
    B      0   -   1   1   1   1   0   1   1      6         4
    C      1   0   -   0   0   1   1   1   1      5         1
    D      1   0   1   -   1   1   1   1   1      7         9
    E      0   0   1   0   -   1   1   1   0      4         0
    F      0   0   0   0   0   -   1   1   1      3         1
    G      0   1   0   0   0   0   -   1   1      3         1
    H      1   0   0   0   0   0   0   -   0      1         9
    I      0   0   0   0   1   0   0   1   -      2         4

    $\bar{R} = 4 \qquad \sum (R - \bar{R})^2 = 30$

    $K = \frac{12 \sum (R - \bar{R})^2}{N(N^2 - 1)} = \frac{12 \times 30}{9(9^2 - 1)} = .500$

In Table 21.4, A is preferred to B, and a 1 is entered in the cell
corresponding to row A and col. B above the main diagonal. A complementary 0
is entered in col. A and row B below the main diagonal. All other choices
may be similarly represented. We note that where no response
inconsistencies are present, all entries on one side of the main diagonal are
1's and all entries on the other side 0's. In this table the presence of some
0's above the main diagonal and the complementary 1's below it indicate the
presence of inconsistencies. Let us now sum the rows of Table 21.4. If no
inconsistencies were present, the row sums would be the numbers 8, 7, 6,
5, 4, 3, 2, 1, 0. Because of the presence of inconsistencies, the actual
obtained numbers are 7, 6, 5, 5, 4, 3, 3, 2, 1, although not in that order.
The effect of inconsistencies is to reduce the variability of the numbers
obtained by adding up the rows of the response pattern. Denote a row sum by
R. The mean of the row sums is $\bar{R} = \sum R/N$, which may be shown equal
to (N - 1)/2. The sum of squares of row sums is:

[21.21]    $\sum (R - \bar{R})^2 = \sum R^2 - \frac{N(N - 1)^2}{4}$

It is appropriate to consider the maximum and minimum values of this sum 
of squares. The maximum value of E(R — R)? occurs when no inconsis- 
tencies are present in the response pattern and is equal to N(N* — 1)/12. 
The minimum value of У (К — К)? depends on whether М is odd or even. If 
N is odd, the minimum value of E(R — К)? is 0. If М is even, it may be 
shown that the minimum value of =(R — R)? is not 0, but is N/4 (Kendall, 
1943). We then define a coefficient of consistence of response K as follows: 


observed sum of squares — minimum sum of squares 
uares — minimum sum of squares 


maximum sum of sq 


Simple substitution shows that if N is odd

$$K = \frac{12\sum (R - \bar{R})^2}{N(N^2 - 1)}$$

and if N is even

$$K = \frac{12\sum (R - \bar{R})^2 - 3N}{N(N^2 - 4)}$$

This is Kendall's coefficient of consistence. It takes the value 0 in the
case of maximal inconsistency and 1 when no inconsistencies are present.
The calculation of K is illustrated in Table 21.4. In this example N is
odd, Σ(R − R̄)² = 30, and K = .500.

How may the coefficient K be interpreted? The number of inconsistent
triads of the kind A > B > C > A may be denoted by d, which is related to
the coefficient K. It may be shown that when N is odd

$$d = \frac{N(N^2 - 1)(1 - K)}{24}$$

and when N is even

$$d = \frac{N(N^2 - 4)(1 - K)}{24} \tag{21.26}$$

In the example of Table 21.4, the number of inconsistencies d is found to
be 15. The maximum possible number of inconsistencies is 30. Thus the
number of inconsistent triads is one-half the maximum possible number,
and K = .50. A K of .20 would mean that the number of inconsistent triads
was four-fifths of the maximum possible number.
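
The relations among the row sums, K, and d can be checked numerically. The following is a minimal Python sketch and is not part of the original text; the function name `coefficient_of_consistence` is an illustrative assumption, and the row sums are those quoted above for Table 21.4.

```python
def coefficient_of_consistence(row_sums):
    """Return (sum of squares, K, d) for the row sums of a paired-comparison table.

    Uses the formulas for odd and even N given in the text.
    """
    n = len(row_sums)
    mean = (n - 1) / 2                      # mean of the row sums
    ss = sum((r - mean) ** 2 for r in row_sums)
    if n % 2 == 1:                          # N odd: minimum sum of squares is 0
        k = 12 * ss / (n * (n ** 2 - 1))
        d = n * (n ** 2 - 1) * (1 - k) / 24
    else:                                   # N even: minimum sum of squares is N/4
        k = (12 * ss - 3 * n) / (n * (n ** 2 - 4))
        d = n * (n ** 2 - 4) * (1 - k) / 24
    return ss, k, d


# Row sums reported for Table 21.4 (N = 9).
ss, k, d = coefficient_of_consistence([7, 6, 5, 5, 4, 3, 3, 2, 1])
print(ss, k, d)
```

For these row sums the sketch reproduces the sum of squares of 30, K = .50, and d = 15 reported in the text.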


21.16 THE SIGNIFICANCE OF THE
COEFFICIENT OF CONSISTENCE


The significance of the coefficient of consistence may be approached by
considering the distribution of the number of inconsistent triadic
relations d where choices are made at random. Kendall (1955) provides a
table of probabilities that particular values of d will be attained or
exceeded for N = 2 to 7. For N > 7, Kendall has shown that a χ² test may be
used which provides approximate probabilities. The quantity

$$\chi^2 = \frac{8}{N - 4}\left(\frac{1}{4}C_3^N - d + \frac{1}{2}\right) + df \tag{21.27}$$

has an approximate χ² distribution with degrees of freedom given by

$$df = \frac{N(N - 1)(N - 2)}{(N - 4)^2} \tag{21.28}$$

The term $C_3^N$ in the expression for χ² is the number of combinations of N
things taken three at a time, or N!/3!(N − 3)!. In using this test the
required probability that a value of d equal to or greater than that
obtained will result where choices are allotted at random is the
complement of the probability of χ².

For the data of Table 21.4, N = 9 and d = 15. We have

$$df = \frac{9 \times 8 \times 7}{(9 - 4)^2} = 20.16$$

$$\chi^2 = \frac{8}{9 - 4}\left(\frac{1}{4} \times 84 - 15 + \frac{1}{2}\right) + 20.16 = 30.56$$

With 20 degrees of freedom, values of χ² of 28.41 and 31.41 are exceeded
with probabilities .10 and .05, respectively. The probability of obtaining
a value of χ² as large as 30.56, and hence a value of d as small as 15,
where choices are assigned at random is therefore between .05 and .10. The
number of inconsistent triads is smaller than the 21 expected on the
assignment of choices at random, but the consistency represented in the
data falls short of significance at the .05, or 5 per cent, level.
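
The large-sample test can be sketched in a few lines of Python. This is an illustrative aid rather than part of the original text; the function name `consistence_chi_square` is an assumption, and the sketch only evaluates formulas [21.27] and [21.28]. The resulting value must still be referred to a table of χ².

```python
from math import comb


def consistence_chi_square(n, d):
    """Return (chi-square, df) for d inconsistent triads among n objects.

    Intended for n > 7, where the chi-square approximation is used.
    """
    df = n * (n - 1) * (n - 2) / (n - 4) ** 2
    # comb(n, 3) is the number of triads; one quarter of them are
    # expected to be inconsistent when choices are made at random.
    chi_square = 8 / (n - 4) * (comb(n, 3) / 4 - d + 0.5) + df
    return chi_square, df


chi_square, df = consistence_chi_square(9, 15)
print(round(chi_square, 2), round(df, 2))
```

Because d enters the formula with a negative sign, small values of d (high consistency) give large values of χ².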


EXERCISES


1 Calculate Spearman's rank correlation coefficient for the following
paired ranks:
X 1 2 3 4 5 6
Y 2 4 5 1 6 3


Does the coefficient obtained differ significantly from zero?


2 Convert the following measurements to ranks:
X 4 4 Y 7 7 9 16 17 21 25


Y 8 16 8 8 16 20 12 ДЫ 5 320,


Calculate Spearman's rank correlation coefficient. Does the coefficient
obtained differ significantly from zero?


3 Are the following values of ρ significantly different from zero? (a) ρ =
.30 for N = 25, (b) ρ = .60 for N = 15, (c) ρ = .70 for N = 10.


4 Calculate the statistic S for the following sets of paired ranks:
a X 1 2 3 4 5 6
  Y 6 2 1 5
b X 1 4 6
  Y 6 3.5 1.5 1.5 3.5 5
c X 2 2 2 3 4 5
  Y 6 3.5 1.5 1.5 3.5 5
d X 2 2 2 5 5 5
  Y 6 2 1 3 5
e X 2 2 5 5 5
  Y 6 3.5 1.5 1.5 3.5 5
f X 2 2 5 5 5
  Y 5 5 2 2 2 5


5 Calculate the sampling variances for the statistic S obtained for the
sets of paired ranks in Exercise 4 above. In each case obtain the normal
deviate z with a continuity correction.


6 Three judges rank order a group of seven students on an examination
as follows:

Student 
А | Val sr 2406 7 
B 27 Қ, 5 7 


C 1 2 9 




Compute the Spearman rank coefficients between judges and the coef- 
ficient of concordance. 


A supervisor ranks six employees A, B, C, D, E, and F on job performance
using the method of paired comparisons. The data are as follows:


A>B,A>C,A>D,E >A, Е >A, B>C, р В, В Е, 


Calculate a coefficient of consistence for the table composed of the first 
five rows and columns of Table 21.4. 


NONPARAMETRIC
TESTS OF SIGNIFICANCE


22.1 INTRODUCTION

Many tests of significance involve assumptions about the nature of the
distributions of the variables in the populations from which the samples
are drawn. The t test and the analysis of variance, for example, assume
normality of the parent distribution. In experimental work situations
arise where either little is known about the population distribution of
the dependent variable or this distribution is known to depart appreciably
from the normal form. In such situations nonparametric tests may be
appropriately used. Nonparametric tests make few assumptions about the
properties of the parent distribution. Assumptions about the parent
distribution are involved in nonparametric tests, but these are usually
fewer in number, weaker, and easier to satisfy in data situations.
Nonparametric tests are frequently spoken of as distribution-free tests.
The implication is that they are free, or independent, of some
characteristics of the population distribution.

The reader will recall the distinction between nominal, ordinal, interval,
and ratio variables. Nonparametric methods are appropriate for nominal
and ordinal data; parametric methods for interval and ratio data. In
practice, nonparametric methods are frequently used with data of this
latter type. The data are reduced to a form such that a nominal, or an
ordinal, statistical procedure may be applied to them. An important class
of nonparametric tests employs only the sign properties of the data. All
observations above a fixed value, such as the median, may be assigned a
plus, and all below, a minus. The original variable is replaced by, or
transformed to, another variable which takes the sign values plus or
minus. Another class of nonparametric test employs the rank properties of
the data. The original observations are replaced by the numbers 1, 2,
3, . . . , N. Subsequent statistical manipulation and inferences are based
on ranks.

In applying a conventional test of significance, such as a two-sample t
test, estimates X̄₁ and X̄₂ of the population means μ₁ and μ₂ are used to
test the null hypothesis H₀: μ₁ = μ₂ under the assumption that the
distributions are normal and the variances are equal, σ₁² = σ₂². In many
nonparametric procedures the null hypothesis under test is not formulated
in terms of the parameters of parent populations, nor are estimates of
population parameters calculated. A number of commonly used two-sample
procedures test the null hypothesis in its most general form. This null
hypothesis is that the samples come from populations with the same
distribution. This is tested against the alternative that samples come
from populations with different distributions. Of course, if we are
willing to assume in a two-sample test that the distributions are normal
and σ₁² = σ₂², then if the distributions


are in effect tests of medians. If the assumption is made that the parent 
populations are symmetrical, such tests become also tests of means. 

The question arises as to the criteria to be applied in comparing
alternate procedures for testing the same hypothesis. The procedures to be
compared may be evaluated in terms of their power. The power of a test is
the probability of rejecting the null hypothesis when that hypothesis is
false. It is one minus the probability of a Type II error, or 1 − β. The
power of a statistical test depends on the level of significance, the
alternate hypothesis H₁, and the sample size. Two tests, A and B, may be
compared by considering the relative sample size required to make them
equally powerful. The relative efficiency of test B with respect to test A
is given by N_a/N_b, where N_b is the number of observations required to
make test B as powerful as test A when test A is based on N_a
observations. A measure commonly used is the asymptotic relative
efficiency, which is the limiting value of the ratio N_a/N_b as N
approaches infinity and H₁ approaches the null hypothesis H₀. Questions
can be raised regarding the practical usefulness of the asymptotic
relative efficiency, since in most cases interest does not reside in large
sample size or in alternate hypotheses close to the null hypothesis. The
asymptotic relative efficiency of many nonparametric tests is known
relative to the most powerful test available, which for two-sample tests
is frequently the t test. Comparisons with the t test, for example, are
made for normal distributions where both a conventional and a
nonparametric procedure may be applied.

For a comprehensive treatment of nonparametric tests the reader is
referred to Siegel (1956) and to Bradley (1968). Both books contain useful
tables.


22.2 A SIGN TEST FOR TWO INDEPENDENT SAMPLES


This test is known as the median test. It compares the medians of two
independent samples. The null hypothesis is that no difference exists
between the medians of the populations from which the samples are drawn.
The corresponding parametric test is a t test for comparing the means of
independent samples. The median test is based on the idea that in two
samples drawn from the same population the expectation is that as many
observations in each sample will fall above as below the joint median.
The data consist of two independent samples of N₁ and N₂ observations. To
apply the median test the median of the combined N₁ + N₂ observations is
calculated. In each sample, observations above the joint median are
assigned a + and those at or below it a −. The number of + and − signs for
each sample is ascertained. A χ² test is used to determine whether the
observed frequencies of + and − signs depart significantly from
expectation under the null hypothesis.
The following are observations for two independent samples:

Sample I     10  10  10  12  15  17  17  19  20  22  25  26
Sample II     6   7   8   8  12  16  19  19  22

The median of the N₁ + N₂ observations is 16. Assigning a + to values
above the median and a − to values at or below it, we obtain

Sample I      −   −   −   −   −   +   +   +   +   +   +   +
Sample II     −   −   −   −   −   −   +   +   +

These data may be tabulated in the form of a 2 × 2 table as follows:


The value of χ² for this table with Yates's correction for continuity is .51.
The value of χ² required for significance at the 5 per cent level is 3.84.
Obviously, in this case we have no grounds for rejecting the null hypothesis
that the samples came from populations with the same median. This is a
two-tailed test.

The asymptotic relative efficiency of the sign test for independent
samples when compared to the t test under the assumptions of normality
and equal variance is 2/π = .637.
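
The steps of the median test can be sketched in Python as follows. The sketch is an illustration and not part of the original text: the function name `median_test` and the two small samples are hypothetical, and the χ² statistic uses the usual shortcut formula for a 2 × 2 table with Yates's correction.

```python
from statistics import median


def median_test(sample_1, sample_2):
    """Return the joint median, the 2 x 2 counts, and chi-square with Yates's correction.

    Counts are (plus_1, minus_1, plus_2, minus_2); values at or below the
    joint median are counted as minus, as in the text.
    """
    joint = median(list(sample_1) + list(sample_2))
    a = sum(x > joint for x in sample_1)
    b = len(sample_1) - a
    c = sum(x > joint for x in sample_2)
    d = len(sample_2) - c
    n = a + b + c + d
    # Yates's correction: reduce |ad - bc| by n/2, but not below zero.
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    chi_square = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return joint, (a, b, c, d), chi_square


# Hypothetical data, chosen only to illustrate the calculation.
group_1 = [12, 15, 17, 18, 21, 24, 25, 30]
group_2 = [5, 7, 9, 10, 11, 13, 16, 20]
joint, counts, chi_square = median_test(group_1, group_2)
print(joint, counts, chi_square)
```

The resulting χ² has 1 degree of freedom and is compared with 3.84 for a test at the 5 per cent level. The sketch assumes that both signs occur at least once overall, since otherwise the denominator is zero.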


22.3 A SIGN TEST FOR TWO CORRELATED SAMPLES


This test compares two correlated samples, and is applicable to data
composed of N paired observations. The difference between each pair of
observations is obtained. The null hypothesis is that the median
difference between the pairs is zero. The test is based on the idea that
under the null hypothesis the expectation is that half the differences
between the paired observations will be positive and the other half
negative. The symmetrical binomial (½ + ½)ᴺ is used to obtain the
probabilities required for a one-tailed or a two-tailed test.

The following are the signs of the differences between paired
observations, X and Y, for a sample of 10 individuals:

Sign of X − Y     −   −   +   +   0   +   +   +   −   +

Under the null hypothesis the probability that X is greater than Y is equal
to the probability that Y is greater than X, which in turn is equal to ½.
The expected numbers of + and − signs are equal. In this example we have
six plus signs, three minus signs, and one zero difference. The zero
difference is discarded. From the binomial expansion (½ + ½)⁹ we can
ascertain the exact probability of obtaining six or more plus signs under
the null hypothesis. This probability is 130/512 = .254. The probability
of obtaining six or more signs of the same kind, whether plus or minus, is
twice this value, or .508. This is a two-tailed test. We have no grounds
for rejecting the null hypothesis.


Where N is not too small, the normal approximation to the binomial or χ²
may be used, preferably with Yates's correction. In this case the expected
values are N/2. In the above example the observed values are 6 and 3, the
expected values are 4.5 and 4.5, the corrected observed values are 5.5 and
3.5, and χ² = .44. The probability of obtaining a χ² equal to or greater
than .44 under the null hypothesis is .507. Although N is small, this is in
close agreement with the exact probability of .508 obtained from the
binomial. The reader will recall that χ² provides the probability for a
two-tailed test.

Instead of the χ² procedure described above, a computationally simpler
method may be used. Obtain the difference between the number of + and −
signs. Denote this difference by D. It may be shown that

$$z = \frac{|D| - 1}{\sqrt{N}}$$

approaches the normal form as N increases in size. This formula
incorporates a continuity correction. Values of 1.96 and 2.58 are required
for significance at the .05 and .01 levels of significance, respectively,
for a nondirectional test. In the above example we have six plus signs and
three minus signs. D = 6 − 3 = 3, and

$$z = \frac{3 - 1}{\sqrt{9}} = .67$$

Quite clearly this falls short of significance. The reader should recall
that for df = 1 the quantity z² = χ². In the above example, if the
calculation is carried beyond two decimal places, z² = .6667² = .4444 and
χ² = .4444.

The asymptotic relative efficiency of this test when compared to the
corresponding t test, and under the required assumptions, is 2/π = .637.
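
Both the exact binomial probability and the normal deviate z can be obtained with a short Python sketch. It is offered as an illustration only and is not part of the original text; the function name `sign_test` is an assumption, and zero differences are taken to have been discarded before the counts are supplied.

```python
from math import comb, sqrt


def sign_test(n_plus, n_minus):
    """Return (one-tailed p, two-tailed p, z) for counts of + and - signs."""
    n = n_plus + n_minus
    larger = max(n_plus, n_minus)
    # Exact probability of `larger` or more signs of one kind under (1/2 + 1/2)^n.
    one_tailed = sum(comb(n, k) for k in range(larger, n + 1)) / 2 ** n
    two_tailed = min(1.0, 2 * one_tailed)
    # Normal deviate with the continuity correction, z = (|D| - 1) / sqrt(N).
    z = max(abs(n_plus - n_minus) - 1, 0) / sqrt(n)
    return one_tailed, two_tailed, z


# Six plus signs and three minus signs, as in the example above.
one_tailed, two_tailed, z = sign_test(6, 3)
print(round(one_tailed, 3), round(two_tailed, 3), round(z, 4))
```

Squaring the value of z returned for the example gives the χ² of .4444 obtained with Yates's correction.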


22.4 A SIGN TEST FOR k INDEPENDENT SAMPLES
This is an obvious extension of the median test for two independent
samples. The data are comprised of k samples of n₁, n₂, . . . , n_k
observations. As before, the null hypothesis is that no difference exists
in the medians of the populations from which the samples are drawn. The
median of the combined n₁ + n₂ + · · · + n_k observations is calculated.
For each sample, observations above the joint median are assigned a + and
those either at or below the joint median a −. The data are arranged in a
2 × k contingency table, and a χ² test applied.

The following are data for four samples.

Sample I      3   6  11  14  16  18  21  33
Sample II     3   3   4   5   5   8   9  14
Sample III   18  18  25  26  29  31
Sample IV    14  16  19  22  22  25  21  35

The total number of observations is 30. The median is 17. Assigning a + to
values above the median and a − to values at or below, we obtain

Sample I      −   −   −   −   −   −   +   +
Sample II     −   −   −   −   −   −   −   −
Sample III    −   −   +   +   +   +
Sample IV     −   −   +   +   +   +   +   +


These data may be arranged in a 2 × 4 table as follows:

                +     −   Total
Sample I        2     6      8
Sample II       0     8      8
Sample III      4     2      6
Sample IV       6     2      8
Total          12    18     30

The value of χ² calculated on this table is 7.56. The number of degrees of
freedom is (4 − 1)(2 − 1) = 3. The value of χ² required for significance at
the 5 per cent level is 7.82. The observed value falls just below this.


22.5 A RANK TEST FOR TWO INDEPENDENT SAMPLES


The most commonly used rank test for comparing two independent samples is
the Wilcoxon rank sum test. Tests which are equivalent to the rank sum
test have been developed by Mann and Whitney, and others. The hypothesis
under test here is that the two samples come from populations with the
same distribution. If assumptions are made regarding the equivalence of
shapes and variances of the two distributions, the Wilcoxon test becomes a
test of central location.

In applying the Wilcoxon test the N₁ and N₂ observations are combined. The
N₁ + N₂ observations are then arranged in order. A rank 1 is assigned to
the smallest value, a rank 2 to the next smallest, and so on. The sum of
ranks, R₁, is obtained for the smaller of the two samples, if the samples
are unequal in size. If the samples are equal in size, either rank sum may
be used. R₁ is then evaluated in relation to its distribution, which is
discussed below.

The model distribution against which particular values of R₁ are evaluated
is obtained by considering a finite population of integers comprised of
N₁ + N₂ = N members. These integers are 1, 2, 3, . . . , N. The problem of
drawing samples of size N₁ from this population of N₁ + N₂ = N members may
be considered. The number of possible equiprobable samples is $C_{N_1}^{N}$.
A value R₁ may be obtained for each sample and a frequency distribution
made of the values of R₁. This distribution can be used to evaluate
particular values of R₁. If a particular value of R₁ has a small
probability in relation to this distribution, we reject the hypothesis
that the samples come from the same population. The reader should note
that this distribution is closely related to the sampling distribution of
means from a finite population discussed in Sec. 9.5. Here the sampling
distribution of sums, and not means, is under consideration, and the
population from which samples are drawn is one of integers extending from
1 to N. The mean of this distribution, R̄₁, is N₁ times the mean of the
N₁ + N₂ ranks and is

$$\text{Mean} = \bar{R}_1 = \frac{N_1(N_1 + N_2 + 1)}{2} \tag{22.1}$$

The variance of the distribution of R₁ can be shown to be

$$\text{Variance} = \sigma_{R_1}^2 = \frac{N_1 N_2 (N_1 + N_2 + 1)}{12} \tag{22.2}$$

The exact distributions of R₁ are known for N₁ and N₂ up to 25. The
distribution approaches the normal form fairly rapidly. When both N₁ and N₂
are equal to or greater than 8 or 10, a large-sample procedure using the
normal approximation and a continuity correction will lead to estimates of
the required probabilities which do not differ much from those obtained
from the exact distributions. The normal deviate z with a continuity
correction is given by:

$$z = \frac{|R_1 - \bar{R}_1| - 1}{\sqrt{\dfrac{N_1 N_2 (N_1 + N_2 + 1)}{12}}} \tag{22.3}$$

If this value is equal to or greater than 1.96 or 2.58, we reject the null
hypothesis for a nondirectional test at the .05 or .01 level and accept the
alternative hypothesis that the samples are from different populations. For
a directional test the .05 and .01 levels are 1.64 and 2.33.
Consider the following observations:
Sample I ӨТ 188 801 52 оду 187 60: 02718 117. 
Sample II 6 9 4 16 2 4 $ 47 50 55 63 72 
Assigning ranks, proceeding from the smallest to the largest values, we
obtain

Sample I      5   7   8  13  14  16  18  19  20  22
Sample II     1   2   3   4   6   9  10  11  12  15  17  21

The sum of ranks R₁ for sample I is 142. The mean of the distribution of
R₁, that is, R̄₁, is 115. The normal deviate is

$$z = \frac{|142 - 115| - 1}{\sqrt{\dfrac{10 \times 12\,(10 + 12 + 1)}{12}}} = 1.71$$

Since this falls below 1.96, we have no grounds for rejecting the null
hypothesis for a two-tailed test. The result is, however, significant at the 5
per cent level for a one-tailed test.
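
The large-sample procedure can be sketched in Python as follows. The sketch is illustrative and not part of the original text: the function names `midranks` and `rank_sum_z` are assumptions, the two short samples are hypothetical, and the continuity correction of 1 follows formula [22.3]. Tied values receive the average of the ranks they would otherwise occupy.

```python
from math import sqrt


def midranks(values):
    """Rank values from smallest to largest, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1   # average of ranks i+1 .. j+1
        i = j + 1
    return ranks


def rank_sum_z(r1, n1, n2):
    """Return the mean and variance of R1 and the normal deviate z of [22.3]."""
    mean = n1 * (n1 + n2 + 1) / 2
    variance = n1 * n2 * (n1 + n2 + 1) / 12
    z = (abs(r1 - mean) - 1) / sqrt(variance)
    return mean, variance, z


# Hypothetical samples, used only to show how R1 is obtained from raw scores.
first, second = [3, 5, 9], [1, 2, 4, 7]
ranks = midranks(first + second)
r1_small = sum(ranks[:len(first)])

# Values from the example in the text: R1 = 142, N1 = 10, N2 = 12.
mean, variance, z = rank_sum_z(142, 10, 12)
print(r1_small, mean, variance, round(z, 2))
```

For the values of the example the sketch returns a mean of 115 and a z of about 1.71, in agreement with the text.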

When ties occur, the tied observations may be assigned the average of
the ranks they would occupy if no ties had occurred. If ties are fairly
numerous, a correction may be applied to the standard deviation in the
denominator of the z ratio. Corrected for ties, that ratio becomes

$$z = \frac{|R_1 - \bar{R}_1| - 1}{\sqrt{\dfrac{N_1 N_2}{N(N - 1)}\left(\dfrac{N^3 - N}{12} - \sum T\right)}} \tag{22.4}$$

where N = N₁ + N₂ and T = (t³ − t)/12, where t is the number of values
tied at a particular rank. The summation of T extends over all groups of
ties.

The above procedures use the normal approximation to the distribution of
R₁. If an exact test is required, Table K may be used. This table shows
the exact lower-tail critical values of R₁ for N₁ and N₂ up to 25 at
probability levels equal to or less than .10, .05, .025, .01, .005, and
.001. For example, the table shows a p = .05 for R₁ = 19 where N₁ = 5 and
N₂ = 5. This means that the probability of obtaining a value equal to or
less than R₁ in samples of this size is equal to or less than .05. This is
a directional test which uses the lower tail of the distribution. Since
the distribution is symmetrical, the corresponding upper-tail values are
given by noting that a lower-tail value R₁ is R̄₁ − R₁ points below the
mean. The corresponding value above the mean is R̄₁ + (R̄₁ − R₁) =
2R̄₁ − R₁. If R₁ is an upper-tail value, that is, if it is above the mean,
Table K is entered with 2R̄₁ − R₁. The null hypothesis is rejected if it is
smaller than the critical value. To assist calculation Table K shows
values of 2R̄₁. In the illustrative example above, R₁ is 142 for N₁ = 10
and N₂ = 12. The mean value R̄₁ is 115. The lower-tail value corresponding
to R₁ is 2R̄₁ − R₁ = 230 − 142 = 88. Entering Table K we note that p is
slightly less than .05. Thus we may assert significance at this level for
a directional test. For a nondirectional or two-tailed test, either R₁ or
2R̄₁ − R₁, whichever is appropriate, is referred to Table K and the
probabilities in the table are doubled. The appropriate choice between R₁
and 2R̄₁ − R₁ is, of course, always the smaller value.

The above discussion sounds rather complex. In practice the procedure is
simple. First, calculate R₁. Calculate R̄₁ using formula [22.1]. Second, if
R₁ is less than R̄₁ refer R₁ to Table K to obtain the required probability.
Third, if R₁ is greater than R̄₁ calculate 2R̄₁ − R₁, and refer this
quantity to Table K to obtain the required probability.
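
For small samples the model distribution of R₁ can be built by direct enumeration, which is how exact lower-tail probabilities of the kind tabled in Table K arise. The following Python sketch is an illustration and not part of the original text; the function names `lower_tail_probability` and `table_entry` are assumptions, and enumeration is practical only when the number of possible samples is modest.

```python
from itertools import combinations


def lower_tail_probability(r1, n1, n2):
    """Exact probability of a rank sum equal to or less than r1 under the null hypothesis.

    Every sample of n1 ranks drawn from 1..N is treated as equally likely.
    """
    n = n1 + n2
    sums = [sum(sample) for sample in combinations(range(1, n + 1), n1)]
    return sum(s <= r1 for s in sums) / len(sums)


def table_entry(r1, n1, n2):
    """Return the smaller of R1 and 2 * mean - R1, the value referred to the table."""
    twice_mean = n1 * (n1 + n2 + 1)
    return min(r1, twice_mean - r1)


p_19 = lower_tail_probability(19, 5, 5)   # N1 = N2 = 5, as in the text
p_20 = lower_tail_probability(20, 5, 5)
print(round(p_19, 4), round(p_20, 4), table_entry(142, 10, 12))
```

With N₁ = N₂ = 5 there are 252 possible samples, and R₁ = 19 is the largest value whose lower-tail probability does not exceed .05, in agreement with the tabled value cited above.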


The Wilcoxon rank sum test is in effect the same as the Mann-Whitney U
test. Mann and Whitney (1947) studied the distribution of a statistic U
which is related in a simple way to R₁.

$$U_1 = N_1 N_2 + \frac{N_1(N_1 + 1)}{2} - R_1 \tag{22.5}$$

$$U_2 = N_1 N_2 + \frac{N_2(N_2 + 1)}{2} - R_2 \tag{22.6}$$

U is the smaller of these two values. Tables showing critical values of U
are available.

The Wilcoxon rank sum test has an asymptotic relative efficiency when
compared with the t test for independent samples of 3/π = .955. This
comparison assumes that the distributions are normal. If the distributions
are rectangular, the asymptotic relative efficiency is 1.00. For certain
other types of distributions the asymptotic relative efficiency is greater
than 1.00. All the available evidence indicates that the Wilcoxon rank sum
test is an excellent alternative to the t test.


22.6 A RANK TEST FOR TWO CORRELATED SAMPLES
The rank test described here is due to Wilcoxon and is usually called the
Wilcoxon matched-pairs signed-ranks test. The data are a set of N paired
observations on X and Y. The difference, d, between each pair is
calculated. If the two observations in a pair are the same, then d = 0 and
the pair is deleted from the analysis. Values of d may be either positive
or negative. The d's are then ranked without regard to sign, that is, the
absolute values |Xᵢ − Yᵢ| are ranked. A rank of 1 is assigned to the
smallest d, of 2 to the next smallest, and so on. If two or more d's are
tied, the practice usually adopted is to assign to the tied ranks the
average of the ranks they would have been assigned if they had differed.
The sign of the difference d is attached to each rank. If d is positive,
the rank is positive; if d is negative, the rank is negative. Denote the
sum of the positive ranks by W₊ and the sum of the negative ranks by W₋.

The model null distribution against which W₊ is evaluated is obtained in
the following way. If the two samples X and Y are samples from the same
population, then the probability that Xᵢ − Yᵢ is either plus or minus is
one-half. Also the probability that the rank corresponding to |Xᵢ − Yᵢ| is
either plus or minus is one-half. All possible arrangements of plus or
minus in relation to the N ranks may be considered. There are 2ᴺ such
arrangements. These are viewed as equiprobable. W₊ is calculated for each
arrangement and the frequency distribution of W₊ becomes the null
distribution against which particular values of W₊ are evaluated. To
illustrate, for N = 3 the number of arrangements of plus and minus is
2³ = 8. These are as follows:

1    2    3    W₊
+    +    +    6
−    +    +    5
+    −    +    4
+    +    −    3
−    −    +    3
−    +    −    2
+    −    −    1
−    −    −    0


Eight values of W₊ occur, one for each arrangement, ranging from 0 to 6. In
general W₊ will range from 0 to N(N + 1)/2. It can be shown that the mean
and variance of the distribution of W₊ are as follows:

$$\text{Mean} = \bar{W}_+ = \frac{N(N + 1)}{4} \tag{22.7}$$

$$\text{Variance} = \sigma_{W_+}^2 = \frac{N(N + 1)(2N + 1)}{24} \tag{22.8}$$

The distribution of W₊ approaches the normal form fairly rapidly.
Consequently for samples of reasonable size the normal approximation may be
used. The normal deviate z is given by

$$z = \frac{W_+ - \dfrac{N(N + 1)}{4}}{\sqrt{\dfrac{N(N + 1)(2N + 1)}{24}}} \tag{22.9}$$

Values of 1.96 and 2.58 are, as usual, required for significance at the 5 per
cent and 1 per cent levels for a nondirectional or two-tailed test.

For N up to 40, values of W₊ may be referred to Table I of the Appendix.
This table provides the lower-tail cumulative probabilities for values of
W₊ that lie close to the selected significance levels, .05, .025, .01, and
.005. To illustrate, for N = 10 and a significance level of .05, two values
are shown: 10 with an associated probability of .0420 and 11 with a
probability of .0527. Thus the probability of obtaining a value of W₊ equal
to or less than 10, when the null hypothesis is true, is .0420, and the
probability of obtaining a value of W₊ equal to or less than 11 is .0527.
This is a directional test. It uses the lower tail of the distribution. A
directional test may also use the upper tail. Since the distribution is
symmetrical, the corresponding upper-tail value is [N(N + 1)/2] − W₊. This
quantity is equal in absolute magnitude to the sum of the negative ranks
|W₋|. To illustrate, suppose N = 10 and W₊ = 50. What is the probability of
obtaining a value of W₊ equal to or greater than 50 when the null
hypothesis is true? Here we calculate [N(N + 1)/2] − W₊ = 10 × 11/2 − 50 =
5. The value 5 is referred to Table I and a probability of .0098 is
obtained. Thus under the null hypothesis the probability of obtaining a
value of W₊ equal to or greater than 50 is in the neighborhood of .01. For
a nondirectional or two-tailed test we refer either W₊ or
[N(N + 1)/2] − W₊, whichever is the smaller, to the table and double the
probabilities.

The discussion above is somewhat involved. In practice a simple procedure
may be used. First, calculate W₊ and |W₋|. Second, take the smaller of
these two quantities and refer this to Table I with the appropriate value
of N.

The following are paired observations, X and Y, for a sample of 10
individuals:

X       15   19   31   36   10   11   19   15   10   16
Y       19   30   26    8   10    6   17   13   22    8
d       −4  −11    5   28    0    5    2    2  −12    8
Rank    −3   −7  4.5    9       4.5  1.5  1.5   −8    6

Values of d are calculated. One pair of observations is tied and is deleted
from subsequent consideration. The d's are rank-ordered by absolute
magnitude. The lowest values are a pair of 2's. These are assigned rank
values of 1.5. The sum W₊, the sum of positive ranks, is 27. The value
[N(N + 1)/2] − W₊ = 9 × 10/2 − 27 = 18. Note that |W₋|, the absolute value
of the sum of the negative ranks, is also 18. This value 18 is referred to
Table I. No basis exists in this case for rejecting the null hypothesis for
either a directional or a nondirectional test.

The asymptotic relative efficiency of the Wilcoxon signed-rank test
relative to the t test is .955. This test can be viewed as a useful and
satisfactory alternative to the t test.
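
The calculation of W₊ and |W₋|, and the construction of the exact null distribution from the 2ᴺ equiprobable arrangements of signs, can be sketched in Python. The sketch is illustrative and not part of the original text. The function names are assumptions, and the Y values are those consistent with the differences and ranks reported in the example; two of them (6 and 8) are inferred rather than read directly. The enumeration assumes untied ranks 1 to N.

```python
from itertools import product


def signed_rank_sums(x, y):
    """Return (W+, |W-|) after dropping zero differences and averaging tied ranks."""
    d = [a - b for a, b in zip(x, y) if a != b]
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1   # average rank for ties
        i = j + 1
    w_plus = sum(r for r, v in zip(ranks, d) if v > 0)
    w_minus = sum(r for r, v in zip(ranks, d) if v < 0)
    return w_plus, w_minus


def lower_tail_probability(w, n):
    """Exact probability that W+ is equal to or less than w for ranks 1..n."""
    count = 0
    for signs in product((0, 1), repeat=n):     # 1 marks a positive rank
        total = sum(rank for rank, s in zip(range(1, n + 1), signs) if s)
        count += total <= w
    return count / 2 ** n


x = [15, 19, 31, 36, 10, 11, 19, 15, 10, 16]
y = [19, 30, 26, 8, 10, 6, 17, 13, 22, 8]       # sixth and tenth values inferred
w_plus, w_minus = signed_rank_sums(x, y)
print(w_plus, w_minus)
print(round(lower_tail_probability(10, 10), 4),
      round(lower_tail_probability(11, 10), 4),
      round(lower_tail_probability(5, 10), 4))
```

The sketch reproduces W₊ = 27 and |W₋| = 18 for the example, and for N = 10 it returns the probabilities .0420, .0527, and .0098 cited in the discussion of Table I.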




22.7 A RANK TEST FOR k INDEPENDENT SAMPLES
A rank test for k independent samples is the Kruskal-Wallis (1952) one-way
analysis of variance by ranks. This is a generalization of the Wilcoxon
rank sum test to k groups. The null hypothesis is that the k independent
samples of n₁, n₂, . . . , n_k members are from the same population. To
apply the test all the observations for the k samples are ranked. The
lowest value is assigned a rank of 1, the next lowest 2, and so on. The sum
of the ranks, Rᵢ, for each of the k samples is obtained. If all k samples
are from the same population, the expectation is that the mean ranks R̄ᵢ
will be equal for the k groups, and equal to the mean of the N ranks, which
is (N + 1)/2.

The null distribution here involves a consideration of the N! arrangements
of ranks in k groups. Each arrangement is regarded as equiprobable.


For each arrangement the following statistic could be calculated:

$$S = \sum_{i=1}^{k} n_i \left(\bar{R}_i - \frac{N + 1}{2}\right)^2 \tag{22.10}$$

The squared quantity in the above expression is simply the squared
difference between the mean rank of the ith group and its expected value
under the null hypothesis. Since the groups may be unequal in size, each
squared difference is weighted according to group size, nᵢ, to obtain a
final sum of squares, S. S could be calculated for each of the N!
arrangements of ranks and its frequency distribution examined. Some
advantages attach to the study of a statistic H, which is closely related
to S and is given by:

$$H = \frac{12S}{N(N + 1)} = \frac{12}{N(N + 1)} \sum_{i=1}^{k} n_i \left(\bar{R}_i - \frac{N + 1}{2}\right)^2 \tag{22.11}$$

The distribution of H approximates the distribution of chi square with
k − 1 degrees of freedom. For k = 3 and nᵢ ≤ 5, tables of critical values of
H with exact probabilities have been prepared by Kruskal and Wallis. For
larger values of k and nᵢ the chi-square approximation must be used. For
computational purposes it is convenient to write H in the form

$$H = \frac{12}{N(N + 1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N + 1) \tag{22.12}$$

where Rᵢ is the sum of ranks for the ith group.

When ties occur, the usual convention is adopted of assigning to the tied
observations the average of the ranks they would otherwise occupy. The
value of H is then divided by

$$1 - \frac{\sum T}{N^3 - N}$$

where T = t³ − t, and t is the number of tied observations in a group. The
quantity H corrected for ties is

$$H = \frac{\dfrac{12}{N(N + 1)} \displaystyle\sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N + 1)}{1 - \dfrac{\sum T}{N^3 - N}} \tag{22.13}$$

The correction for ties will increase the value of H.
The following are data for three samples: 


Sample 1 3 t № 16 22 2 з 36 
Sample II 3 4 ? 18 19 32 
Sample Ш 22 38 46 47 47 50 53 54 56 


In this example n₁ = 8, n₂ = 6, n₃ = 9, and k = 3. The observations are
ranked to obtain

Sample I      1.5   4.5    6     7   10.5   12    13    15
Sample II     1.5    3    4.5    8     9    14
Sample III   10.5   16    17   18.5  18.5   20    21    22    23

The sums of ranks are calculated. These are R₁ = 69.5, R₂ = 40, and
R₃ = 166.5. We note that we have four sets of ties of two observations
each. T = 2³ − 2 = 6, and for the four sets ΣT = 24. The value of H is then

$$H = \frac{\dfrac{12}{23(23 + 1)}\left(\dfrac{69.5^2}{8} + \dfrac{40^2}{6} + \dfrac{166.5^2}{9}\right) - 3(23 + 1)}{1 - \dfrac{24}{23^3 - 23}} = \frac{13.88}{.998} = 13.91$$

In this example the effect of the correction for ties is negligible and may
for all practical purposes be ignored. On reference to a table of χ² with
df = 2 we note that an H of 13.91 is significant at better than the 1 per
cent level. We may then reject the hypothesis that the samples are from the
same population.

The Kruskal-Wallis test has an asymptotic relative efficiency of .955
when compared with the F test resulting from the analysis of variance
applied to independent samples. For k = 2 this test is equivalent to the
Wilcoxon rank sum test.
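
The computational form of H and the correction for ties can be sketched in Python. The sketch is an illustration and not part of the original text; the function name `kruskal_wallis_h` and its argument names are assumptions. It takes the rank sums, the group sizes, and the number of observations in each set of ties.

```python
def kruskal_wallis_h(rank_sums, sizes, tie_sizes=()):
    """Return (H, H corrected for ties) from rank sums and group sizes."""
    n = sum(sizes)
    h = (12 / (n * (n + 1))
         * sum(r ** 2 / m for r, m in zip(rank_sums, sizes))
         - 3 * (n + 1))
    # Each set of t tied observations contributes T = t^3 - t.
    divisor = 1 - sum(t ** 3 - t for t in tie_sizes) / (n ** 3 - n)
    return h, h / divisor


# Rank sums, group sizes, and ties (four sets of two) from the example above.
h, h_corrected = kruskal_wallis_h([69.5, 40, 166.5], [8, 6, 9], [2, 2, 2, 2])
print(round(h, 2), round(h_corrected, 2))
```

The returned values are referred to a table of χ² with k − 1 degrees of freedom, which is 2 in this example.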


22.8 A RANK TEST FOR k CORRELATED SAMPLES

A rank test for k correlated samples is the Friedman two-way analysis of
variance by ranks (1937). The data are a set of k observations for a sample
of N individuals. Such data arise in many experiments where subjects are
tested under a number of different experimental conditions. The
corresponding parametric test is an analysis of variance for two-way
classification where observations are made on each of a group of
individuals under more than two conditions. If there is reason to believe
that the assumptions underlying the analysis of variance are not satisfied
by the data, the Friedman rank method is appropriate.

The data are arranged in a table containing N rows and k columns. The rows
correspond to individuals, or groups, and the columns to experimental
conditions. Table 22.1 shows such an arrangement of data for eight subjects
tested under four experimental conditions. The observations in the rows are
ordered as shown in Table 22.2. For example, the four observations in the
top row are 4, 5, 9, and 3. These are replaced by the ranks 2, 3, 4, and 1.
The ranks in each column are summed. If the samples are from the same
population, the ranks in each column will be a random arrangement of the
numbers 1, 2, 3, and 4. Under these circumstances the sums of ranks for
columns will tend to be the same. If these sums differ significantly, the
hypothesis that they are from the same population may be rejected.

Table 22.1  Material recalled after four time intervals for a group of
            eight subjects

               Time interval
Subject      I    II   III    IV
1            4     5     9     3
2            8     9    14     7
3            7    13    14     6
4           16    12    14    10
5            2     4     7     6
6            1     4     5     3
7            2     6     7     9
8            5     7     8     9


The null distribution here involves a consideration of the k! arrangements of ranks in any row. These are considered equiprobable. Given N rows, the number of possible equiprobable arrangements of ranks is $(k!)^N$. For each of these arrangements a statistic S may be calculated, where

[22.14]
$$S = \sum_{i=1}^{k} (R_i - \bar{R})^2$$

$R_i$ is the sum of ranks for the ith column, and $\bar{R}$ is the mean rank sum. S is simply the sum of squares of rank sums about the mean rank sum. If the samples are from the same population, the rank sums will tend to be equal and S will tend to be small. A frequency distribution may be made of the $(k!)^N$ values of S, and this distribution may be used to evaluate particular values of S. If the probability associated with a particular value of S is small, the null hypothesis is rejected. For small values of k and N the exact distributions of S are known. Bradley (1968) provides a table of exact critical values of S for k = 3 and N up to 15, and for k = 4 and N up to 8.

Table 22.2 Ranks assigned by rows for the data of Table 22.1

              Time interval
Subject    I     II    III    IV
1          2     3     4      1
2          2     3     4      1
3          2     3     4      1
4          4     2     3      1
5          1     2     4      3
6          1     3     4      2
7          1     2     3      4
8          1     2     3      4
R_i        14    20    29     17

For values that lie outside the tabled values of S it is customary to use a statistic which is a function of S. This statistic is given by


[22.15]
$$\chi_r^2 = \frac{12S}{Nk(k+1)}$$

This statistic has an approximate chi-square distribution with k − 1 degrees of freedom. For computational purposes a more convenient way of writing $\chi_r^2$ is

[22.16]
$$\chi_r^2 = \frac{12}{Nk(k+1)} \sum_{i=1}^{k} R_i^2 - 3N(k+1)$$

For the data of Table 22.2 we have

$$\chi_r^2 = \frac{12}{8 \times 4(4+1)} (14^2 + 20^2 + 29^2 + 17^2) - 3 \times 8(4+1) = 9.45$$

This result for df = 4 − 1 = 3 falls between the .05 and .01 levels of significance. Actually it is a little above the 2 per cent level. If this level of confidence is acceptable, we may conclude that the samples are not drawn from the same population and that a difference in the experimental conditions is exerting an effect. In this example S = 126. If this is referred to a table of exact critical values of S, as given in Bradley (1968), the associated probability is found to fall between .01 and .05, not far from the .02 level. The chi-square approximation is in close agreement with the more exact test.

The asymptotic relative efficiency of the Friedman test relative to the F test resulting from a two-way analysis of variance with one observation per cell is

$$\frac{3}{\pi}\left(\frac{k}{k+1}\right)$$

The efficiency of the test increases as k increases, and extends from .637 for k = 2 to a maximum of .955 for k = ∞. For k = 2 this test is the same as the sign test for correlated samples.
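The Friedman calculation for Table 22.2 can be sketched in a few lines of Python. This is an illustrative sketch only; the function name `friedman_statistic` and the variable names are assumptions made for the example, and the input is taken to be the table of within-row ranks rather than the raw scores.

```python
def friedman_statistic(rank_rows):
    """Return (rank sums, S, chi-square) for an N x k table of within-row ranks."""
    n = len(rank_rows)
    k = len(rank_rows[0])
    rank_sums = [sum(row[j] for row in rank_rows) for j in range(k)]
    mean_rank_sum = n * (k + 1) / 2          # mean of the k column sums
    s = sum((r - mean_rank_sum) ** 2 for r in rank_sums)
    chi_r = 12 * s / (n * k * (k + 1))       # formula [22.15]
    return rank_sums, s, chi_r


table_22_2 = [
    [2, 3, 4, 1],
    [2, 3, 4, 1],
    [2, 3, 4, 1],
    [4, 2, 3, 1],
    [1, 2, 4, 3],
    [1, 3, 4, 2],
    [1, 2, 3, 4],
    [1, 2, 3, 4],
]
rank_sums, s, chi_r = friedman_statistic(table_22_2)
print(rank_sums, s, round(chi_r, 2))
```

For these ranks the sketch gives rank sums of 14, 20, 29, and 17, S = 126, and a chi-square value of 9.45, in agreement with the hand calculation.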


22.9 MONOTONIC TREND TEST FOR INDEPENDENT SAMPLES

In Chap. 19 methods of trend analysis for parametric data were described. Methods of trend analysis using ranks may also be used. These are the nonparametric analogues of the tests described in Chap. 19. A test of monotonic trend is the nonparametric analogue of a test of linear trend. This section describes a simple test of monotonic trend for independent samples, which employs the statistic S as used in the definition of Kendall's tau (see Chap. 21).

Comment on the concept of monotonicity is appropriate here. A function 
Y = f(X) is said to be a monotonic increasing function if any increment in
X is associated with an increment in Y. Also, if any increment in X is 
associated with a decrement in Y, the function is said to be a monotonic 
decreasing function. The magnitude of the increment or decrement in X 
associated with the increment or decrement in Y is irrelevant to the 
concept of a monotonic function. Monotonicity is an order concept. 

A common type of experiment is one in which k treatments are applied to k independent groups of $n_1, n_2, \ldots, n_k$ members, and a measurement obtained for each member. Such data are frequently analyzed using the analysis of variance for one-way classification. The nonparametric analogue of this is the Kruskal-Wallis one-way analysis of variance by ranks, described in Sec. 22.7. If, however, the k treatments exhibit an order, the question of monotonic trend may be raised. In effect, this question is answered by testing for significance the correlation between the ranks corresponding to the $n_1 + n_2 + \cdots + n_k = N$ measurements and the ranks for the k treatments. The ranks for treatments consist of k sets of tied members with $n_1, n_2, \ldots, n_k$ members, respectively, in each of the k sets. Thus for k = 3, and $n_1 = n_2 = n_3 = 10$, the ranks for treatments consist of three sets of 10 tied values.

To illustrate, the following are measurements for three independent samples obtained under three ordered treatments.


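Sample I      3    7    11   16   22   29   31   36
Sample II     3    4    7    18   19   32
Sample III    22   38   46   47   47   50   53   54   56

These values are ranked as in the Kruskal-Wallis one-way analysis of variance by ranks. The values 1, 2, and 3 are assigned to the three samples as the X variable, and the ranks of the measurements form the Y variable.

Sample I      X    1     1     1     1     1      1     1     1
              Y    1.5   4.5   6     7     10.5   12    13    15
Sample II     X    2     2     2     2     2      2
              Y    1.5   3     4.5   8     9      14
Sample III    X    3     3     3     3     3      3     3     3     3
              Y    10.5  16    17    18.5  18.5   20    21    22    23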


The average rank procedure for tied ranks could, of course, be applied to the X variable, in which case the values 4.5, 11.5, and 19 would replace the values 1, 2, and 3. No advantage attaches to this. In practical computation the X variable need not be recorded at all. The X variable is shown here to illustrate the fact that the problem is a simple correlational one. The value of S as used in the definition of Kendall's tau, and described in Chap. 21 of this book, is calculated. The value of S in this example is as follows:

S = 14 + 10 + 9 + 9 + 4 + 3 + 3 + 1 + 9 + 9 + 9 + 9 + 9 + 7 = 105

The value 14 is obtained by comparing the initial Y value of 1.5 for sample I with all Y values for samples II and III. The Y value of 1.5 for sample I need not be compared with other Y values for that sample because all values in sample I are tied on X.

The sampling variance of S is obtained from formula [21.14]. In this example N = 23. The X variable contains three groups of ties: one group of eight ties, one of six ties, and one of nine ties. The Y variable contains four groups of tied pairs. The variance of S from formula [21.14] is 1,245.25, and the standard error is 35.28. Subtracting unity as a continuity correction and assuming normality of the sampling distribution, we obtain the normal deviate z = 104/35.28 = 2.95, which warrants rejection of the null hypothesis.

The steps involved in this procedure may be stated in summary as follows:

1 Rank all the observations from 1 to N on the Y variable, as in the Kruskal-Wallis one-way analysis of variance by ranks, substituting average ranks for tied values. Use the ranks for the order of treatments as the X variable, which will consist of as many sets of ties as there are treatments.
2 Calculate S in the usual way.
3 Calculate the sampling variance of S by applying formula [21.14].
4 Subtract unity from S as a continuity correction, and divide this by its standard error to obtain the normal deviate z.
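The four steps above can be sketched in Python for the three-sample example. This is a hedged illustration: the function names are invented here, and formula [21.14], which is not reproduced in this section, is assumed to be Kendall's variance of S corrected for ties in both variables. That assumption reproduces the variance of 1,245.25 quoted in the text. Raw scores are used for Y because S depends only on order, so ranking them first would give the same result.

```python
from collections import Counter
from math import sqrt


def kendall_s(x, y):
    """Concordant minus discordant pairs; a pair tied on x or y contributes 0."""
    s = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            product = (x[j] - x[i]) * (y[j] - y[i])
            s += (product > 0) - (product < 0)
    return s


def variance_of_s(x, y):
    """Tie-corrected sampling variance of S (assumed form of formula [21.14])."""
    n = len(x)
    t = list(Counter(x).values())  # sizes of the tied groups on X
    u = list(Counter(y).values())  # sizes of the tied groups on Y

    def a(groups):
        return sum(g * (g - 1) * (2 * g + 5) for g in groups)

    def b(groups):
        return sum(g * (g - 1) * (g - 2) for g in groups)

    def c(groups):
        return sum(g * (g - 1) for g in groups)

    return ((n * (n - 1) * (2 * n + 5) - a(t) - a(u)) / 18
            + b(t) * b(u) / (9 * n * (n - 1) * (n - 2))
            + c(t) * c(u) / (2 * n * (n - 1)))


samples = [
    [3, 7, 11, 16, 22, 29, 31, 36],
    [3, 4, 7, 18, 19, 32],
    [22, 38, 46, 47, 47, 50, 53, 54, 56],
]
x = [treatment for treatment, sample in enumerate(samples, start=1) for _ in sample]
y = [score for sample in samples for score in sample]

s = kendall_s(x, y)
variance = variance_of_s(x, y)
z = (abs(s) - 1) / sqrt(variance)   # continuity correction; S is positive here
print(s, round(variance, 2), round(z, 2))
```

With these data the sketch gives S = 105, a variance of 1,245.25, and z = 2.95, matching the values in the worked example.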


22.10 MONOTONIC TREND TEST FOR CORRELATED SAMPLES

This test may be applied to data obtained by making measurements on N subjects, each under k ordered conditions. We proceed by replacing the original measurements by ranks as in the Friedman two-way analysis of variance by ranks described in Sec. 22.8. The statistic S, as used in the definition of Kendall's tau, is calculated for each of the N subjects. The values of S are summed over the N subjects to obtain $\Sigma S$. The quantity $\Sigma S$ is a measure of monotonic trend. It is descriptive of the increase, or decrease, in the experimental variable with increase in the treatment variable.


In the absence of ties the sampling variance of S for any subject is given by formula [21.11] and is the same for all subjects. The variance of $\Sigma S$ is the sum of the separate variances. Thus $\sigma_{\Sigma S}^2 = \sum \sigma_S^2 = N\sigma_S^2$. Assuming the normality of the distribution of $\Sigma S$, the normal deviate z is

[22.17]
$$z = \frac{\Sigma S}{\sigma_{\Sigma S}}$$

The normal deviate z takes the usual critical values 1.96 and 2.58 at the .05 and .01 levels, respectively, for a nondirectional test. In many experimental situations in which trend tests are used, some prior basis exists for predicting the direction of the trend; consequently a directional test will frequently be appropriate.

Estimates of probabilities obtained from the normal approximation to the distribution of $\Sigma S$ will be improved by using a correction for continuity. To apply this correction, we subtract unity from $\Sigma S$ if it is positive and add unity if it is negative. Thus the absolute value of $\Sigma S$ is reduced by unity.

The method described above is illustrated with reference to the data of Table 22.1, which show hypothetical measurements for eight subjects under four treatments. These measurements have been ranked for each of the N subjects, as shown in Table 22.2. A value of S is calculated for each of the N subjects. The values of S summed over subjects give $\Sigma S$ as follows:

$\Sigma S$ = 0 + 0 + 0 − 4 + 4 + 2 + 6 + 6 = 14

For k = 4 from formula [21.11] the sampling variance $\sigma_S^2 = 8.67$. Note that in this case in using formula [21.11] we write N = k, the number of treatment groups. The variance $\sigma_{\Sigma S}^2 = 8 \times 8.67 = 69.33$, and $\sigma_{\Sigma S} = 8.33$. Reducing $\Sigma S$ by unity as a continuity correction results in z = 13/8.33 = 1.56, which falls short of significance at the .05 level. Thus we cannot argue for a significant monotonic trend from these data.

It is of incidental interest to note that for k = 2 the quantity $(|\Sigma S| - 1)^2/N$ is distributed approximately as $\chi^2$ with df = 1, and the quantity $(|\Sigma S| - 1)/\sqrt{N}$ has an approximately normal distribution. In this case the present test is the same as the usual sign test for two correlated samples.

The steps involved in the above procedure may be summarized as follows:

1 Rank the scores for each subject from 1 to k.
2 Calculate S for each subject.
3 Sum S for all subjects to obtain $\Sigma S$.
4 Calculate the sampling variance of S, $\sigma_S^2$, using formula [21.11], and multiply this by N to obtain the sampling variance of $\Sigma S$, $\sigma_{\Sigma S}^2$. The square root of this quantity is the standard error, $\sigma_{\Sigma S}$.
5 Divide $|\Sigma S| - 1$ by the standard error $\sigma_{\Sigma S}$ to obtain the normal deviate z.
6 Reject the null hypothesis for a nondirectional test at the .05 level if z > 1.96, and at the .01 level if z > 2.58. The corresponding values of z for a directional test are 1.64 and 2.33.
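The six steps can be sketched in Python for the untied ranks of Table 22.2. This is an illustration under stated assumptions: the function names are invented for the example, and formula [21.11] is assumed to be the untied variance N(N − 1)(2N + 5)/18 with N replaced by k, which gives the 8.67 quoted above for k = 4.

```python
from math import sqrt


def s_within_subject(ranks):
    """Kendall's S for one subject, with the conditions taken in their given order."""
    s = 0
    for i in range(len(ranks)):
        for j in range(i + 1, len(ranks)):
            s += (ranks[j] > ranks[i]) - (ranks[j] < ranks[i])
    return s


def correlated_trend_test(rank_rows):
    """Return the per-subject S values, their sum, and the continuity-corrected z."""
    n = len(rank_rows)
    k = len(rank_rows[0])
    s_values = [s_within_subject(row) for row in rank_rows]
    total = sum(s_values)
    var_s = k * (k - 1) * (2 * k + 5) / 18      # untied variance for one subject
    standard_error = sqrt(n * var_s)
    corrected = max(abs(total) - 1, 0)          # reduce |sum of S| by unity
    z = corrected / standard_error if total >= 0 else -corrected / standard_error
    return s_values, total, z


table_22_2 = [
    [2, 3, 4, 1],
    [2, 3, 4, 1],
    [2, 3, 4, 1],
    [4, 2, 3, 1],
    [1, 2, 4, 3],
    [1, 3, 4, 2],
    [1, 2, 3, 4],
    [1, 2, 3, 4],
]
s_values, total, z = correlated_trend_test(table_22_2)
print(s_values, total, round(z, 2))
```

The sketch returns the subject values 0, 0, 0, −4, 4, 2, 6, 6, a sum of 14, and z = 1.56, as in the worked example. It does not handle ties within a subject; the tie adjustment described next would have to be subtracted from the variance first.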


Ties may occur in the measurements obtained for each subject. Ties will presumably not ordinarily occur in the treatment variable, the values of that variable being controlled by the experimenter. Under these circumstances the value $\sigma_S^2$ may be calculated separately for each subject using formula [21.13], and the values of $\sigma_S^2$ summed to obtain $\sigma_{\Sigma S}^2$. A more convenient procedure is to calculate $\sigma_{\Sigma S}^2$ as in the untied case, and then subtract unity from this variance for each tied pair, 3.67 for each triplet of ties, and so on.

Illustrative data are shown in Table 22.3. This table shows measurements obtained for eight subjects under four conditions. The quantity $\Sigma S$ has been calculated, and is found to be −25. The data contain five tied pairs, and one triplet of ties. For k = 4 without ties the sampling variance $\sigma_S^2 = 8.67$, and the variance $\sigma_{\Sigma S}^2 = 8 \times 8.67 = 69.33$. We subtract from this 5.00 for the five tied pairs and 3.67 for the triplet of ties to obtain a corrected value of the variance of 60.67 and a standard error of 7.79. Reducing the absolute value of $\Sigma S$ by unity as a continuity correction results in z = 24/7.79 = 3.08, which is significant at better than the .01 level for a nondirectional test.

Table 22.3 Data illustrating trend test for correlated data with ties

Measurements under four conditions
Subject    I     II    III    IV
1          7     7     4      2
2          6     1     5      5
3          3     2     2      2
4          4     2     2      2
5          8     6     5      2
6          4     7     2      2
7          9     4     3      1
8          10    4     2      2

Ranks under four conditions
Subject    I     II    III    IV     S
1          3.5   3.5   2      1      −5
2          4     1     2.5    2.5    −1
3          2     1     3.5    3.5    +3
4          4     2     2      2      −3
5          4     3     2      1      −6
6          3     4     1.5    1.5    −3
7          4     3     2      1      −6
8          4     3     1.5    1.5    −4
                              ΣS = −25

For a more detailed discussion of nonparametric trend tests see Ferguson (1965).

22.11 EXACT TEST OF SIGNIFICANCE FOR A 2 × 2 TABLE


An exact test of significance for a 2 × 2 table has been developed by R. A.
Fisher. This test enables the calculation of exact probabilities and avoids 
the use of the continuous chi-square distribution to obtain approximate 
probabilities. It may be used appropriately where the expected cell 
frequencies are small. The principal objection to its use is the laborious 
calculation required. 

In tossing a number of coins a finite number of events may result. In tossing six coins, seven outcomes are possible. We may toss 0, 1, 2, 3, 4, 5, or 6 heads. The binomial distribution may be used to determine the exact probabilities associated with these seven outcomes. Similarly, for any 2 × 2 table, given the restrictions imposed by the marginal totals, a finite number of arrangements of the cell frequencies may result. For example, for a table with row totals of 3 and 8 and column totals of 5 and 6, only four arrangements of the cell frequencies are possible. Writing A and B for the cell frequencies in the first row and C and D for those in the second, these are as follows:

Arrangement    A    B    C    D
1              0    3    5    3
2              1    2    4    4
3              2    1    3    5
4              3    0    2    6


The exact probability associated with each arrangement may be calculated. To illustrate the situation here, consider an urn containing 11 balls, three of one color and eight of another. Withdraw the balls one at a time and assign at random five to a black box and six to a white box. Count the number of balls of the first color in the black box. Repeat the experiment many times and obtain the relative frequencies of the four possible outcomes. These relative frequencies are experimentally determined estimates of the probabilities associated with the four possible 2 × 2 tables. The required probabilities may be calculated without this laborious experimental procedure. The probability of any arrangement of cell frequencies, given the marginal restrictions, is obtained by the formula

[22.18]
$$p = \frac{(A+B)!\,(C+D)!\,(A+C)!\,(B+D)!}{N!\,A!\,B!\,C!\,D!}$$


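The numerator is the product of the factorials of the marginal totals. The denominator is N! times the product of the factorials of the cell frequencies. The factorial of any number, say, 5, is 5 × 4 × 3 × 2 × 1 = 120; also 0! = 1. The probabilities associated with the four tables above are

$$\begin{aligned}
1\quad & p = \frac{3!\,8!\,5!\,6!}{11!\,0!\,3!\,5!\,3!} = \frac{4}{33} = .1212 \\
2\quad & p = \frac{3!\,8!\,5!\,6!}{11!\,1!\,2!\,4!\,4!} = \frac{15}{33} = .4545 \\
3\quad & p = \frac{3!\,8!\,5!\,6!}{11!\,2!\,1!\,3!\,5!} = \frac{12}{33} = .3636 \\
4\quad & p = \frac{3!\,8!\,5!\,6!}{11!\,3!\,0!\,2!\,6!} = \frac{2}{33} = .0606 \\
\text{Total}\quad & \phantom{p = {}} 1.0000
\end{aligned}$$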


Clearly, in this case we have no grounds for rejecting the hypothesis that the two variables are independent. The probability of obtaining a degree of association equal to or better than the one observed, and in the same direction, is obtained by summing the probabilities of arrangements 3 and 4. This probability is .3636 + .0606 = .4242. Thus in about 42 samples in 100, a result equal to or better than the one observed would occur by chance. With the present data no arrangement of the 2 × 2 table can lead to a statistically significant result.
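Formula [22.18] is easy to evaluate by machine, which removes the principal objection to the exact test. The following Python sketch enumerates the four arrangements for marginal totals of 3, 8, 5, and 6. The function name `table_probability` and the way the arrangements are generated are assumptions made for this illustration.

```python
from math import factorial


def table_probability(a, b, c, d):
    """Exact probability of a 2 x 2 table with cells a, b / c, d (formula [22.18])."""
    n = a + b + c + d
    numerator = (factorial(a + b) * factorial(c + d)
                 * factorial(a + c) * factorial(b + d))
    denominator = (factorial(n) * factorial(a) * factorial(b)
                   * factorial(c) * factorial(d))
    return numerator / denominator


row_1, row_2, col_1 = 3, 8, 5   # marginal totals; the second column total is 6

# With the margins fixed, the table is determined by the first cell alone.
arrangements = [(a, row_1 - a, col_1 - a, row_2 - (col_1 - a))
                for a in range(row_1 + 1)]
probabilities = [table_probability(*cells) for cells in arrangements]

# One-tailed probability of arrangement 3 or a more extreme table (arrangement 4)
one_tailed = probabilities[2] + probabilities[3]
print([round(p, 4) for p in probabilities], round(one_tailed, 4))
```

The four probabilities come out as .1212, .4545, .3636, and .0606, and the one-tailed sum for arrangements 3 and 4 is .4242, as in the text.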


Usually the probabilities associated with all possible arrangements of the 2 × 2 table need not be calculated. We need only calculate the probabilities associated with the observed table and those that represent more extreme departures from expectation in the same direction. Let Table 1 below represent the observed data. Tables 2 and 3 are the two more extreme tables in the same direction.


The probabilities associated with these three tables are .2448, .0490, and .0023. The sum of these probabilities is .2961. This falls far short of significance, and we conclude that the evidence is insufficient to warrant rejection of the hypothesis of independence. The sum of the probabilities associated with Tables 2 and 3 is .0513. Thus the arrangement of Table 2 above, if it did occur, would fall short of significance at the 5 per cent level for a one-tailed test. The only arrangement of the three shown which could lead to a conclusion of significance, given the marginal restrictions, is that shown in Table 3.


Tables to assist the application of exact tests of significance to 2 x 2 
tables have been prepared by Finney (1948). An adaptation of Finney's 
tables is given by Siegel (1956) and Bradley (1968). 




EXERCISES

1 The following are data for two groups of experimental animals:


Group I 14 109 127 143 187 204 209 266 2177 
Group II 62 82 89 90 101 106 109 109 205


Apply a sign test to test the hypothesis that the two samples come 
from populations with the same median. 


2 The following are data for a sample of nine animals tested under 
control and experimental conditions: 


Control 21 As 265 39 55 82 46 55 88 
Experimental 18 9 23 26 82 199 42 30 62 


Test the significance of the difference between the two medians using 
a sign test. 


3 The following data are for four groups of experimental animals: 




Group I 5 7.7 56h 147719 
Group II Е «315 18 2 2 
Стоир Ш if 2). 2m 25 0329 
Group IV 23 21 28 31 32 


Apply a sign test to test the hypothesis that the four samples come from populations with the same median.

4 Apply the Wilcoxon rank sum test to the data of Exercise 1 above.
5 Apply the Wilcoxon matched-pairs signed-ranks test to the data of Exercise 2 above.
6 Apply a Kruskal-Wallis one-way analysis of variance by ranks to the data of Exercise 3 above.


7 Apply a Friedman two-way analysis of variance by ranks to the following data:

              Treatment
Subject    I     II    III    IV
1          5     9     4      1
2          6     8     1      3
3          9     10    8      7
4          5     10    4      2
5          8     6     4      1
6          10    8     7      5
7          14    12    13     10

8 Apply a monotonic trend test to the data of Exercise 3 above.
9 Apply a monotonic trend test to the data of Exercise 7 above.
10 Obtain the exact probabilities associated with all possible arrangements of cell frequencies for the following 2 × 2 tables:

In either case would any arrangement of cell frequencies justify a rejection of the hypothesis of independence?

PSYCHOLOGICAL TEST AND MULTIVARIATE STATISTICS

TEST CONSTRUCTION STATISTICS

23.1 INTRODUCTION

Many commonly used psychological tests of ability, achievement, and aptitude are constructed of items which permit two categories of response only, either a pass or a fail. With such items a weight of 1 is assigned for a pass and a weight of 0 for a failure. Such items are spoken of as dichotomous items. A person's score, X, on a test of n items is simply the number of items done correctly, although, as will be subsequently shown, certain corrections or transformations may be applied to this score. A substantial body of theory, and statistical technique, has been developed which is concerned with the construction, accuracy of measurement, and validation of such tests. This body of theory and technique is commonly spoken of as psychological test, or mental test, theory. For a thorough and comprehensive discussion of this subject the reader is referred to Gulliksen (1950), Magnusson (1967), and Lord and Novick (1968). A few elementary aspects only of the subject are discussed here.


23.2 THE MEAN AND VARIANCE OF TEST ITEMS

Consider a test of n items administered to a sample of N individuals. The number of items passed by each individual may be obtained. These are the test scores of the N individuals, denoted by $X_1, X_2, \ldots, X_N$. In addition the number of individuals passing each of the n items may be obtained. These frequencies are ordinarily divided by N to obtain the proportions $p_1, p_2, \ldots, p_n$. With tests of ability these proportions are spoken of as the difficulty values of the items, and are presumed to describe the difficulty of the item. Clearly, if few people in the sample under consideration pass the item, say $p_i = .05$ or $p_i = .10$, the item is a difficult one. If a high proportion in the sample of people pass the item, say $p_i = .80$ or $p_i = .90$, the item may be viewed as an easy item. Note that the proportion is inversely related to difficulty: the higher the proportion, the easier the item. Note further that the proportion $p_i$ is obtained by adding together all the scores of 1 and 0 for a particular item, and dividing by the number of individuals in the sample. The quantity $p_i$ is, therefore, the arithmetic mean of the individual item.

Each of N individuals obtains a score of 1 or 0 on a particular item, say item i, whose mean is $p_i$. The variance of the individual item may be considered, where in this case the variance is defined as $s^2 = \sum (X - \bar{X})^2/N$. If we substitute 1's or 0's for the X's, and $p_i$ for $\bar{X}$, it can be shown that the individual-item variance is given by $s_i^2 = p_i q_i$, where $q_i = 1 - p_i$. The item standard deviation is given by $s_i = \sqrt{p_i q_i}$.

The difficulty value of an item, the item mean, is obviously not independent of the item variance. The variance is a maximum when $p_i = .50$, and $s_i^2 = .50 \times .50 = .25$, and departs from this maximum as $p_i$ departs from .50. The item variance approaches zero as $p_i$ approaches 1.00 or zero.
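The identity $s_i^2 = p_i q_i$ stated above can be verified directly. Of the N scores on item i, $Np_i$ are equal to 1 and $Nq_i$ are equal to 0, and the mean is $p_i$, so that

$$s_i^2 = \frac{\sum (X - p_i)^2}{N} = \frac{N p_i (1 - p_i)^2 + N q_i (0 - p_i)^2}{N} = p_i q_i^2 + q_i p_i^2 = p_i q_i (q_i + p_i) = p_i q_i$$

For example, an item with $p_i = .80$ has variance $.80 \times .20 = .16$, and an item with $p_i = .50$ has the maximum variance of .25.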


23.3 CORRELATION BETWEEN TWO TEST ITEMS: THE PHI COEFFICIENT

In psychological test work the correlation between two test items is the usual product-moment correlation between two variables, where the variables are restricted to the integers 1 or 0. This statistic is the phi coefficient and is applicable to 2 × 2 tables only. It is related to $\chi^2$. Although it may be used to describe the relation between any two dichotomous variables, its most common application is in describing the relation between two test items.

One formula for calculating the phi coefficient, or $\phi$, is

[23.1]
$$\phi = \frac{BC - AD}{\sqrt{(A+B)(C+D)(A+C)(B+D)}}$$

where A, B, C, and D are the four cell frequencies. The term in the denominator of the above expression is the square root of the product of the four marginal totals.

Table 23.1 shows a 2 X 2 table illustrating the relationship between two 
psychological test items. The value of $\phi$ based on this table is .376. The
reader will note that in this example the two underlying variables may be 
regarded as continuous. The categories "pass" and "fail" may be consid- 
ered a dichotomy of an underlying continuous ability variable. Individuals 
above a certain threshold value on the ability variable pass the item; those 
below it fail the item. 


Table 23.1 Computation of phi coefficient of correlation between two test items

                         Frequency                     Proportion
                         Item 2                        Item 2
                   Fail      Pass     Total      Fail    Pass    Total
Item 1    Pass     11 (A)    19 (B)   30         .22     .38     .60
          Fail     15 (C)    5 (D)    20         .30     .10     .40
          Total    26        24       50         .52     .48     1.00

$$\phi = \frac{19 \times 15 - 11 \times 5}{\sqrt{30 \times 20 \times 24 \times 26}} = .376$$


The phi coefficient is related to $\chi^2$ calculated on a 2 × 2 table by the expression

$$\phi = \sqrt{\frac{\chi^2}{N}}$$

or

[23.2]
$$\chi^2 = N\phi^2$$

Any formula for calculating $\chi^2$ for a 2 × 2 table may with minor modification be used for calculating $\phi$.

Alternative formulas for computing $\phi$ may be stated. Let us represent the proportion passing item i by $p_i$ and those failing by $q_i$, where $p_i = 1 - q_i$. Similarly, the proportion passing item j is $p_j$ and the proportion failing $q_j$. The proportion passing both items i and j is represented by $p_{ij}$. The $\phi$ coefficient of correlation between two test items may then be written as

[23.3]
$$\phi = \frac{p_{ij} - p_i p_j}{\sqrt{p_i q_i p_j q_j}}$$

For the example of Table 23.1, $p_{ij} = .38$ and the phi coefficient is

$$\phi = \frac{.38 - .60 \times .48}{\sqrt{.60 \times .52 \times .40 \times .48}} = .376$$

which checks with the result previously obtained. When one of the variables is evenly divided, $p_i = q_i = .50$, the formula for $\phi$ simplifies to

[23.4]
$$\phi = \frac{2p_{ij} - p_j}{\sqrt{p_j q_j}}$$

When both variables are evenly divided and $p_i = q_i = p_j = q_j = .50$, the formula becomes

[23.5]
$$\phi = 4p_{ij} - 1$$
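The agreement between formulas [23.1] and [23.3] can be checked numerically. The short Python sketch below uses the cell frequencies of Table 23.1; the function names are invented for the illustration and are not part of the original text.

```python
from math import sqrt


def phi_from_counts(a, b, c, d):
    """Phi from the four cell frequencies, as in formula [23.1]."""
    return (b * c - a * d) / sqrt((a + b) * (c + d) * (a + c) * (b + d))


def phi_from_proportions(p_ij, p_i, p_j):
    """Phi from the proportions passing each item and both items, as in formula [23.3]."""
    return (p_ij - p_i * p_j) / sqrt(p_i * (1 - p_i) * p_j * (1 - p_j))


# Table 23.1: A = 11, B = 19, C = 15, D = 5, N = 50
phi_counts = phi_from_counts(11, 19, 15, 5)
phi_props = phi_from_proportions(19 / 50, 30 / 50, 24 / 50)
print(round(phi_counts, 3), round(phi_props, 3))
```

Both routes give .376, the value reported for Table 23.1.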


The phi coefficient has been widely used in statistical work associated with 
psychological tests. Usually when investigators speak of the correla- 
tion between dichotomously scored test items, the reference is to the phi 
coefficient. 

The phi coefficient is a particular case of the product-moment correla- 
tion coefficient. If we assign integers, say, 1 and 0, to represent the two 
categories of each variable and calculate the product-moment correlation 
coefficient in the usual way, the result will be identical with $\phi$.

The phi coefficient has a minimum value of −1 in the case of perfect negative and a maximum value of +1 in the case of perfect positive association. These limits, however, can be attained only when the two variables are evenly divided; that is, $p_i = q_i = p_j = q_j = .50$. When the variables are the same shape, $p_i = p_j$ and $q_i = q_j$, but are asymmetrical, $p_i \ne q_i$ and $p_j \ne q_j$, one or the other of the limits −1 or +1 may be attained but not both. The maximum and minimum values of phi are clearly influenced by the marginal totals. Consider the following 2 × 2 tables:

[Tables 1 to 4: four 2 × 2 tables. In Tables 1 and 2 every marginal total is 50; Tables 3 and 4 share the column totals 40 and 60.]


In Tables 1 and 2 both variables are evenly divided and coefficients of +1 
and —1 are possible. Table 3 represents the maximum positive association 
possible, given the restriction of the marginal totals. The phi coefficient is 
.613. Table 4 shows the most extreme negative association possible with
the same marginal totals. The phi coefficient is —.403. For this particular 


set of marginal totals phi can extend from a minimum of —.403 to a 
maximum of .613. 




While the influence of the marginal totals on the range of values of phi 
may in some of its applications prove to be a disadvantage, this effect is in 
no way inconsistent with correlation theory. If a correlation coefficient is 
viewed as a measure of the efficacy of prediction, then perfect prediction in 
both a positive and a negative direction is possible only when the two dis- 
tributions have the same shape and are symmetrical. If one variable is nor- 
mally distributed and the other is rectangular, perfect prediction of the one 
from the other is not possible and the correlation coefficient reflects this 
fact. Perfect prediction in one direction requires symmetry also. The phi 
coefficient, although affected by the marginal totals, is a measure of the 
efficacy of prediction. From this viewpoint it quite rightly reflects the loss 
in degree of prediction resulting from the lack of concordance of the two 
marginal distributions. 

Because $\chi^2 = N\phi^2$, we can readily test the significance of $\phi$ by referring $N\phi^2$ to a chi-square table with 1 degree of freedom. When df = 1, $\chi = \sqrt{\chi^2}$ is a normal deviate and we may refer $\phi\sqrt{N}$ to tables of the normal curve. In sampling from a population where no association exists, the distribution of $\phi$ should be approximately normal with a standard error of $1/\sqrt{N}$. Of course, all considerations pertaining to small frequencies (Sec. 13.6) apply here. N should, clearly, not be too small.
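As an illustration of this test, the Python sketch below applies it to the data of Table 23.1. The variable names are assumptions for the example. The critical values used for comparison are the usual ones: 3.84 and 6.63 for chi-square with 1 degree of freedom at the .05 and .01 levels, and 1.96 and 2.58 for the normal deviate.

```python
from math import sqrt

# Cell frequencies of Table 23.1
a, b, c, d = 11, 19, 15, 5
n = a + b + c + d

phi = (b * c - a * d) / sqrt((a + b) * (c + d) * (a + c) * (b + d))
chi_square = n * phi ** 2    # refer to chi-square with df = 1
z = phi * sqrt(n)            # refer to the normal curve; z squared equals chi-square

significant_at_01 = chi_square > 6.63 and abs(z) > 2.58
print(round(chi_square, 2), round(z, 2), significant_at_01)
```

For these data the sketch gives a chi-square of about 7.06 and a normal deviate of about 2.66, so the association would be judged significant at the .01 level for a nondirectional test, subject to the caution about small frequencies noted above.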


23.4 RESPONSE PATTERNS

The responses of N subjects on a test of n items may be represented in the form of a table containing n rows and N columns. The elements in this table are 1's and 0's. Such an arrangement of the data is spoken of as a response or answer pattern. Table 23.2 shows a hypothetical response pattern for a test of five items administered to a sample of ten individuals.

Difficulty values are shown in the column to the right of Table 23.2 and scores are shown in the bottom row. The mean score, $\bar{X}$, is 2.80. Note that $\bar{X} = \sum X/N = \sum p_i = 2.80$. The mean score on the test is the sum of the difficulty values. The variance of scores on the test, where the variance is defined as $s_x^2 = \sum (X - \bar{X})^2/N$, is 2.16.

Table 23.2 Response pattern for a test of five items administered to a sample of ten subjects

[Table body not recoverable: rows are items 1 to 5, columns are individuals 1 to 10, and entries are 1's and 0's. The difficulty values $p_i$ for items 1 to 5 are .80, .60, .60, .40, and .40.]


23.5 RELATION BETWEEN THE VARIANCE OF TEST SCORES AND THE PROPERTIES OF ITEMS

In test construction a knowledge of the relations between the variance of test scores and the characteristics of test items is of interest. To ascertain the nature of these relations each item on a psychological test is viewed as being interrelated with every other item, the nature of these interrelations being most usefully represented by the interitem covariances. The interitem covariance between two items i and j is given by $r_{ij} s_i s_j = p_{ij} - p_i p_j$. The table, or matrix, of item variances and interitem covariances may be represented as follows:

$$\begin{bmatrix}
s_1^2 & r_{12} s_1 s_2 & \cdots & r_{1n} s_1 s_n \\
r_{12} s_1 s_2 & s_2^2 & \cdots & r_{2n} s_2 s_n \\
\vdots & \vdots & & \vdots \\
r_{1n} s_1 s_n & r_{2n} s_2 s_n & \cdots & s_n^2
\end{bmatrix}$$


Such a table of variances and covariances is called a covariance matrix. 
This matrix is symmetrical with item variances along the diagonal and 
interitem covariances along both sides of the diagonal. 

The reader may recall from Sec. 7.6 that the variance of the sum of a set 
of variables is the sum of all the elements in the covariance matrix. In the 
present context this means that the sum of all the elements in the interitem 


covariance matrix is the variance, $s_x^2$, of scores on the test as a whole. This relation may be written as

[23.6]
$$s_x^2 = \sum_{i=1}^{n} s_i^2 + 2 \sum_{i<j}^{n(n-1)/2} r_{ij} s_i s_j$$

This formula describes the relation between the variance of the test and the properties of the individual test items of which the test is comprised. It expresses the test variance as a function of the item variances and the interitem covariances. For the illustrative response pattern in Table 23.2 for a test of five items administered to a sample of 10 subjects the interitem covariance matrix is:

 .16   −.08   −.08   −.02    .08
−.08    .24    .14    .16    .06
−.08    .14    .24    .06    .06
−.02    .16    .06    .24    .14
 .08    .06    .06    .14    .24


The elements in the diagonal are the item variances, $s_i^2 = p_i q_i$. The elements on either side of the diagonal are the item covariances, $r_{ij} s_i s_j = p_{ij} - p_i p_j$. The sum of all the elements in this matrix is 2.16, which is the variance of test scores, $s_x^2$.
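The statement that the elements of the covariance matrix sum to the test variance can be checked with a few lines of Python. The sketch below enters the matrix as read from the text together with the five difficulty values; the variable names are assumptions made for the illustration.

```python
# Difficulty values of the five items (Table 23.2)
p = [0.80, 0.60, 0.60, 0.40, 0.40]

# Interitem covariance matrix for the response pattern of Table 23.2
cov = [
    [0.16, -0.08, -0.08, -0.02, 0.08],
    [-0.08, 0.24, 0.14, 0.16, 0.06],
    [-0.08, 0.14, 0.24, 0.06, 0.06],
    [-0.02, 0.16, 0.06, 0.24, 0.14],
    [0.08, 0.06, 0.06, 0.14, 0.24],
]
n_items = len(p)

# The diagonal elements should equal the item variances p_i * q_i
diagonal_matches = all(abs(cov[i][i] - p[i] * (1 - p[i])) < 1e-9
                       for i in range(n_items))

# Formula [23.6]: item variances plus twice the covariances above the diagonal
sum_variances = sum(cov[i][i] for i in range(n_items))
sum_covariances = sum(cov[i][j]
                      for i in range(n_items) for j in range(i + 1, n_items))
test_variance = sum_variances + 2 * sum_covariances
print(diagonal_matches, round(sum_variances, 2), round(test_variance, 2))
```

The item variances sum to 1.12 and the covariances contribute a further 1.04, giving the test variance of 2.16. Nearly half of the test variance in this example therefore comes from the covariances between items rather than from the item variances themselves.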

The relation shown in formula [23.6] between test variance and the 
variances and covariances of the items of which the test is comprised has 
important implications for test construction. The purpose of any test is the 
differentiation of individuals or the description of individual differences. In 
the construction of a test an important consideration is to ensure that the test as a whole discriminates, or shows variation, between those individuals to whom it can be appropriately administered. Clearly a test on which every individual made a zero score, a perfect score, or the same score, and for which $s_x^2 = 0$, would serve no useful purpose. Test makers, therefore, in the development of tests attempt to ensure that the test scores have a fairly large variance in relation to the number of items the test contains. Formula [23.6] shows that the larger the variance of a particular item, the greater the contribution tends to be of that item to the variance of the test as a whole.
This has led some test makers to suggest that only items whose difficulty 
values were not distantly removed from .50, say between .30 and .70, be 
included in a test. Formula [23.6] shows also that the greater the interitem 
covariance between two items, the greater the contribution of those two 
items to the test variance. This has led test makers in their test construc- 
tion procedures to include those items which had the higher interitem 


covariances and correlations. 


23.6 INTERNAL CONSISTENCY
Inspection of the response pattern in Table 23.2 shows that individual 4 failed item 1, which had a difficulty value of .80. On the other hand the same individual passed items 2, 3, and 4 with difficulty values of .60, .60, and .40, respectively. His performance on item 1 is clearly inconsistent with his performance on items 2, 3, and 4. Likewise individual 5 failed item 2 with a difficulty value of .60 and passed item 5, a more difficult item, with a difficulty value of .40. This again is an inconsistency of response. If an individual who obtains a score $X_i$ on a test obtains this score by passing the $X_i$ easier items on the test and failing the $n - X_i$ more difficult items, his performance contains no inconsistencies. If all N individuals taking a test obtain their score $X_i$ in this way, and no inconsistencies are present in the response pattern, that response pattern may be spoken of as an internally consistent pattern. With real data, response patterns that exhibit no inconsistencies do not ordinarily occur. Response patterns for different tests exhibit varying numbers of inconsistencies and varying degrees of internal consistency.
Although the term internal consistency has been used here to refer to a particular property of a response pattern, a variety of other terms is quite commonly used. Ferguson (1941) used the term uniqueness to refer to a response pattern with no inconsistencies. Loevinger (1947) used the term homogeneity. DuBois (1970) draws distinctions between different types of test homogeneity.

Table 23.3

Internally consistent response pattern for a test of five items administered to a sample of ten subjects.

Item    Difficulty value
1       .80
2       .60
3       .60
4       .40
5       .40

Mean test score: $\bar{X} = 2.80$

Table 23.3 shows an example of an internally consistent response pattern. The difficulty values of the items are the same as those in Table 23.2. Table 23.3 shows what the response pattern for the data of Table 23.2 would have been had no inconsistencies been present. Although the item difficulties are the same for Table 23.3 as for Table 23.2, the test scores are different. The test scores for the internally consistent pattern of Table 23.3 are more variable. The variance of scores for Table 23.3 is $s_x^2 = 3.90$, as compared with a variance $s_x^2 = 2.16$ for Table 23.2. The effect of inconsistencies in the response pattern is to reduce the variance of test scores.

What would the variance of test scores be if all responses were assigned at random, given the restriction of the difficulty values? Under these conditions we would expect, since the responses are random, that all the interitem correlations and covariances would be zero. The expected value of the variance under these conditions is $s_x^2 = \sum s_i^2 = \sum p_i q_i$. Thus the test variance is the sum of the elements in the diagonal of the covariance matrix. All elements on either side of the diagonal are zero. For the data of Table 23.2 the expected variance, if all responses are assigned at random, is $s_x^2 = 1.12$. For any given set of difficulty values the test variance may vary from a minimum of $s_x^2 = \sum s_i^2$ to a maximum value which is the value of the variance which would result if the response pattern contained no inconsistencies.
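The lower of these two reference values can be checked directly from the item difficulty values. The sketch below is illustrative only; the names are assumptions, and the observed variance of 2.16 is simply copied from the text.

```python
# Item difficulty values (proportion passing) for the five-item example.
p = [.80, .60, .60, .40, .40]

# With random responding all covariances are expected to be zero, so the
# expected test variance reduces to the sum of the item variances p_i * q_i.
expected_random_variance = sum(p_i * (1 - p_i) for p_i in p)

observed_variance = 2.16  # variance reported in the text for Table 23.2

# Share of the observed variance that comes from the interitem covariances.
covariance_share = (observed_variance - expected_random_variance) / observed_variance
print(round(expected_random_variance, 2), round(covariance_share, 2))
```

For these data a little under half of the observed test variance is attributable to the covariances among the items.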


23.7 THE ITEM SELECTION PROBLEM


Test makers in the construction of psychological tests have devised methods for pretesting test items. Not infrequently a preliminary form is prepared which contains a much larger number of items than will be included in the final test. If a test of 50 items is required, the test maker may begin with an initial collection of 100 items. These are administered on a trial basis to a sample of the population on which the test will ultimately be used. How does the test maker decide which are the better 50 items of the 100 items available? What considerations lead to the selection of one item and the rejection of another?

In some cases external criteria are available. Such criteria may be measures of job performance, average grades following one year at the university, or other indices of performance. Each test item may be correlated with the criterion variable, and the correlation coefficients used in the selection of items. Presumably, if the ultimate objective is to construct a test to predict the criterion variable, those items would be included in the final test that have a high correlation with that criterion. Items with a low correlation with the criterion would be rejected. At times criterion groups are available. The criterion may be membership in a top, middle, or low group on job performance; a group of psychotic patients and a group of normal subjects; groups of individuals who complete a four-year university course and those who do not. Here again correlation coefficients, or other statistics, may be used to describe how the test items discriminate between groups.

In many situations no external criterion against which the test items can be validated is available. Under these circumstances a criterion of internal consistency is commonly used; that is, the test maker by one means or another selects a subset of items that exhibits fewer inconsistencies in its response pattern than many of the other subsets that might have been selected. The problem may be stated in this way. For any group of n items a subset of k items may be selected in $C_k^n$ possible ways. By what method may a set of k items be selected from the $C_k^n$ possible sets such that the internal consistency for the set of k items selected is greater than the internal consistency for any of the $C_k^n - 1$ remaining sets? Many methods of item selection in common use, which are applied in the absence of an external criterion, provide approximate solutions to this problem. The subset of items selected is ordinarily not the most internally consistent subset, but is one of the more internally consistent subsets. Although an exact solution to the item selection problem as stated above is possible, its application would be very laborious except for quite small sets of items.

Although a great many methods of item selection exist, the most commonly used involves calculating the correlations between the test items and scores on the whole test. Since the items are dichotomously scored and scores on the whole test may be viewed for all practical purposes as continuous, a form of correlation is required which describes the relation between a dichotomous, or two-categoried, variable and a continuous variable. The forms of correlation used for this purpose are the point biserial correlation and the biserial correlation.
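To give a sense of why an exact solution to the item selection problem stated above is laborious, the following sketch counts the possible subsets for the example of choosing 50 items from 100. It uses Python's standard `math.comb`; the variable names are illustrative.

```python
import math

n_items, k_items = 100, 50   # the text's example: choose 50 items from 100
n_subsets = math.comb(n_items, k_items)
print(f"{n_subsets:.3e}")    # number of candidate 50-item tests

# A small case for comparison: 5 items taken 2 at a time.
print(math.comb(5, 2))
```

The count for the 100-item pool is on the order of 10 to the 29th power, which is why practical methods settle for approximate solutions.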


23.8 POINT BISERIAL CORRELATION


Point biserial correlation provides a measure of the relation between a con- 
tinuous variable, such as scores on a test, and a two-categoried, or dichot- 
omous, variable, such as “pass” or “fail” on a psychological test item. The 
data when arranged in the form of a frequency distribution compose a 
table comprised of R rows and 2 columns. Although commonly used as a 
measure of the correlation between test scores and test items, point 
biserial correlation may be used in other situations as well. For example, 
the continuous variable may be scores on a psychological test, and 
the dichotomous variable may be male or female, or high school graduates 
and university graduates, or a group of normal persons and a group of 
neurotics. 

Point biserial correlation is a product-moment correlation. If we assign a 
1 to individuals in one category and a 0 to individuals in the other and 
calculate the product-moment correlation, the result is a point biserial 
coefficient. Weights other than 1 and 0 may be assigned to the categories. 
The coefficient is not dependent on the weights assigned. 

The formula for point biserial r is

$$r_{pbi} = \frac{\bar{X}_p - \bar{X}_q}{s_x}\sqrt{pq} \qquad [23.7]$$

In this formula $s_x$ is the standard deviation of scores on the continuous variable, defined as $\sqrt{\sum (X - \bar{X})^2 / N}$. If the continuous variable is a test, $s_x$ is the standard deviation of test scores. The quantities p and q are the proportions of individuals in the two categories of the dichotomous variable. If the dichotomous variable is a test item, p is the proportion of individuals who pass the item and q is the proportion who fail. $\bar{X}_p$ and $\bar{X}_q$ are the mean scores on the continuous variable of individuals within the two categories. Again, if the continuous variable is a set of test scores, $\bar{X}_p$ is the mean score of those who pass the item and $\bar{X}_q$ is the mean score of those who fail.

The calculation of point biserial correlation from ungrouped data is illustrated in Table 23.4. This table shows hypothetical scores on a test, and on a test item, for a group of 14 individuals. The mean score, $\bar{X}_p$, ... individuals making the six high ... correlation would assume a m ...


Table 23.4

Calculation of point biserial correlation from ungrouped data

Individual   Test score   Item score
     1            6           0
     2            8           1
     3            8           0
     4           11           0
     5           16           1
     6           25           0
     7           27           0
     8           31           0
     9           31           1
    10           39           0
    11           44           0
    12           50           1
    13           56           1
    14           68           1

Mean score for those who pass: $\bar{X}_p = 38.17$
Mean score for those who fail: $\bar{X}_q = 23.88$

$s_x = 18.19$, $p = .43$, $q = .57$

$$r_{pbi} = \frac{38.17 - 23.88}{18.19}\sqrt{.43 \times .57} = .39$$

If "pass" and "fail" were arranged at random in relation to test scores, the point biserial correlation would have an expected value of zero.

An alternate method of calculating point biserial correlation is the formula

$$r_{pbi} = \frac{\bar{X}_p - \bar{X}}{s_x}\sqrt{\frac{p}{q}} \qquad [23.8]$$

where $\bar{X}$ is the mean of all scores on the continuous variable, and $s_x$, $\bar{X}_p$, p, and q are as defined in formula [23.7].
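The equivalence of formulas [23.7] and [23.8] with the ordinary product-moment correlation can be illustrated numerically. The sketch below uses a small set of hypothetical scores (not the data of Table 23.4), and the function names are assumptions made for the illustration.

```python
from math import sqrt

def pearson(x, y):
    """Product-moment correlation, using N (not N - 1) in the denominators."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = sqrt(sum((b - my) ** 2 for b in y) / n)
    return sxy / (sx * sy)

def point_biserial(scores, item):
    """Return r_pbi computed by formula [23.7] and by formula [23.8]."""
    n = len(scores)
    passed = [s for s, i in zip(scores, item) if i == 1]
    failed = [s for s, i in zip(scores, item) if i == 0]
    p, q = len(passed) / n, len(failed) / n
    mean_all = sum(scores) / n
    s_x = sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    mean_p, mean_q = sum(passed) / len(passed), sum(failed) / len(failed)
    r_237 = (mean_p - mean_q) / s_x * sqrt(p * q)      # formula [23.7]
    r_238 = (mean_p - mean_all) / s_x * sqrt(p / q)    # formula [23.8]
    return r_237, r_238

# Hypothetical data: eight test scores and a pass (1) / fail (0) item.
scores = [3, 5, 6, 8, 9, 11, 12, 14]
item = [0, 0, 1, 0, 1, 1, 0, 1]

r_237, r_238 = point_biserial(scores, item)
r_pm = pearson(scores, item)
# Weights other than 1 and 0 (here 7 and 2) leave the coefficient unchanged.
r_other_weights = pearson(scores, [7 if i == 1 else 2 for i in item])
print(round(r_237, 4), round(r_238, 4), round(r_pm, 4), round(r_other_weights, 4))
```

All four printed values agree, which is the sense in which point biserial correlation is a product-moment correlation.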
Point biserial correlation is not independent of the proportions in the two categories. When p = q = .50, its maximum and minimum values will differ from those which would be obtained when, say, p = .20 and q = .80. The maximum value of $r_{pbi}$ never reaches +1; the minimum value never reaches -1. In predicting a two-categoried variable from a continuous variable, perfect prediction is possible and occurs when the two frequency distributions do not overlap. Perfect prediction of a continuous variable from a two-categoried variable is obviously impossible. Some error in prediction must always occur in predicting a variable which may take a wide range of values from a variable which may take two values only. The point biserial correlation coefficient reflects this fact. It is worth noting here that the regression line obtained by calculating the means of the two columns is of necessity linear, there being only two points. The regression line obtained by calculating the means of the rows cannot be linear except under certain special circumstances.

To test the significance of $r_{pbi}$ from zero the situation may be treated as one requiring a comparison of the two means $\bar{X}_p$ and $\bar{X}_q$. The appropriate value of t may be written

$$t = r_{pbi}\sqrt{\frac{N - 2}{1 - r_{pbi}^2}} \qquad [23.9]$$

The number of degrees of freedom is N - 2. This is a two-tailed test. For large N the quantity $1/\sqrt{N}$ may be used as the standard error of $r_{pbi}$ in testing the significance of the difference from zero.
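Formula [23.9] is simple to evaluate. The sketch below assumes, purely for illustration, a coefficient of .39 obtained from 14 individuals; the function name is not from the text.

```python
from math import sqrt

def t_for_point_biserial(r, n):
    """Formula [23.9]: t = r * sqrt((N - 2) / (1 - r^2)), with N - 2 df."""
    return r * sqrt((n - 2) / (1 - r ** 2))

# Illustrative values (assumed for this sketch): r_pbi = .39 with N = 14.
r_pbi, n = 0.39, 14
t = t_for_point_biserial(r_pbi, n)
df = n - 2
print(round(t, 2), df)
```

The resulting t would then be referred to a table of t with N - 2 degrees of freedom in the usual way.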


23.9 BISERIAL CORRELATION


Biserial correlation is an estimate of the relation between a continuous variable and a dichotomous variable, it being assumed that the variable underlying the dichotomy is continuous and normal. If a bivariate table comprised of R rows and C columns is dichotomized and reduced to a table of R rows and 2 columns, biserial correlation will provide a more accurate estimate of the correlation based on the R × C table than point biserial correlation. Biserial correlation is sometimes used in place of point biserial correlation in correlating test scores with individual test items. The assumption here is that "pass" and "fail" on a test item represent the dichotomy of an underlying normally distributed ability variable.

The formula for biserial correlation is

$$r_{bi} = \frac{\bar{X}_p - \bar{X}_q}{s_x}\cdot\frac{pq}{y} \qquad [23.10]$$

Here $s_x$ is the standard deviation of test scores, defined as for point biserial correlation; $\bar{X}_p$ and $\bar{X}_q$ are the mean test ... normal curve, Table A of the Appendix, we can ascertain that the height of the ordinate y at the point of dichotomy is .348.

For the data of Table 23.4 we may, ... the pass-fail dichotomy is a division ... variable. For these data p = .43, q = .57. The height of the ordinate of the unit normal curve at the point of dichotomy is y = .393, $\bar{X}_p = 38.17$, $\bar{X}_q = 23.88$, $s_x = 18.19$, and

$$r_{bi} = \frac{38.17 - 23.88}{18.19}\times\frac{.43 \times .57}{.393} = .49$$

An alternate formula for biserial correlation is

$$r_{bi} = \frac{\bar{X}_p - \bar{X}}{s_x}\cdot\frac{p}{y} \qquad [23.11]$$

where $\bar{X}$ is the mean score on the test for the total sample. Applying this formula to the data of Table 23.4 we obtain, as before, $r_{bi} = .49$.
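The worked example can be retraced with the summary figures quoted above. In the sketch below the ordinate y is obtained from Python's `statistics.NormalDist` rather than from Table A of the Appendix, and the total mean is computed as a weighted mean of the two group means; both choices are assumptions made for the illustration.

```python
from statistics import NormalDist

# Summary figures quoted in the text for Table 23.4.
mean_pass, mean_fail = 38.17, 23.88
s_x = 18.19
p, q = 0.43, 0.57

std_normal = NormalDist()        # unit normal curve
z = std_normal.inv_cdf(q)        # point that cuts off the lower proportion q
y = std_normal.pdf(z)            # height of the ordinate at the dichotomy

r_bi = (mean_pass - mean_fail) / s_x * (p * q / y)      # formula [23.10]

# Formula [23.11] needs the total mean; here it is taken as p*Xp + q*Xq.
mean_all = p * mean_pass + q * mean_fail
r_bi_alt = (mean_pass - mean_all) / s_x * (p / y)       # formula [23.11]
print(round(y, 3), round(r_bi, 2), round(r_bi_alt, 2))
```

Both formulas give the same coefficient, as the text states.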
Theoretically, the maximum and minimum values of $r_{bi}$ are independent of the point of dichotomy and are -1 and +1. An implicit assumption underlying this statistic is that the continuous many-valued variable is normal, as well as the variable underlying the dichotomy. Values of $r_{bi}$ greater than unity can occur under gross departures from normality.
Some difficulties surround the sampling distribution of $r_{bi}$. The standard error of $r_{bi}$ in sampling from a population where the correlation is zero is roughly

$$s_{r_{bi}} = \frac{\sqrt{pq}}{y\sqrt{N}} \qquad [23.12]$$

When N is large, this formula may be used with reference to the normal curve to test the significance of $r_{bi}$. It should, however, be used with caution, because the probabilities thereby obtained are somewhat inaccurate. The standard error tends to increase with the extremeness of the dichotomies. The reader may wish to compare the standard error of $r_{bi}$ with the corresponding large-sample formula for the ordinary product-moment correlation, $s_r = 1/\sqrt{N}$. The standard error of $r_{bi}$ is always larger than the standard error of the ordinary product-moment correlation. Where p = q = .5, the standard error of $r_{bi}$ is 1.25 times as large as the standard error of r. Where p = .90 and q = .10, the standard error of $r_{bi}$ is 1.71 times that of r.
The relation between biserial and point biserial correlation is given by the expression

$$r_{bi} = r_{pbi}\frac{\sqrt{pq}}{y} \qquad [23.13]$$

The factor $\sqrt{pq}/y$ varies from 1.25 where p = q = .5 to 3.73 where p = .99 and q = .01. Thus $r_{bi}$ is always greater than $r_{pbi}$, and the difference increases with extremeness of the dichotomies.
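The same factor, the square root of pq divided by y, governs both the standard error of the biserial coefficient relative to that of r and the ratio of the biserial to the point biserial coefficient. A brief sketch, with an illustrative function name, reproduces the values quoted in the text.

```python
from math import sqrt
from statistics import NormalDist

def biserial_factor(p):
    """sqrt(p*q) / y, where y is the unit-normal ordinate at the dichotomy."""
    q = 1 - p
    y = NormalDist().pdf(NormalDist().inv_cdf(p))
    return sqrt(p * q) / y

for p in (0.50, 0.90, 0.99):
    print(p, round(biserial_factor(p), 2))
```

The factor is smallest for an even split and grows as the dichotomy becomes more extreme.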


23.10 CONTRIBUTION OF ITEMS TO THE TEST VARIANCE


The proportion of individuals passing an item, $p_i$, has been discussed as a measure of item difficulty. Point biserial and biserial correlation have been discussed as measures of the discriminatory capacity of test items. Difficulty and discriminatory capacity are dual criteria in item selection. Because we have two criteria, a problem arises. Is item i with $p_i = .50$ and $r_{pbi} = .40$ a better item than item j with $p_j = .40$ and $r_{pbi} = .50$? The relative importance of item difficulty and item discriminatory capacity in the selection of items has been viewed as a problem by some test makers.

This problem has a simple solution which involves combining difficulty 
and discriminatory capacity into a single index. This index is the contribu- 
tion of the individual item to the variance of test scores. The presumption 
here is that in the construction of tests we wish, as it were, to acquire or 
capture variance. An item that contributes more to the total variance is 
presumed to be a better item than one that contributes less. 

In Sec. 23.5 the total test variance was shown to be the sum of all the ele- 
ments in the interitem covariance matrix. The sum of a column, or row, in 
the covariance matrix may be viewed as the contribution of the individual 
item to the total variance. This is one way only of defining the contribution 
of the item to the total variance. For the illustrative numerical example of a 
covariance matrix in Sec. 23.5 the column sums for the five items are as 
follows: 


Item    1     2     3     4     5
Sum    .06   .52   .42   .58   .58

The total variance is $s_x^2 = 2.16$. Item 1 is viewed as contributing .06 to this; item 2, .52; item 3, .42; and so on.

It can be readily shown that the contribution of item i to the total test variance is the item-test covariance, $r_{ix} s_i s_x$, where $r_{ix}$ is the point biserial correlation coefficient. The total variance is the sum of the covariances for the n items, that is,

$$s_x^2 = \sum_{i=1}^{n} r_{ix} s_i s_x \qquad [23.14]$$

If we multiply the formula for the point biserial correlation coefficient by $s_i s_x$, we obtain the item-test covariance, which is

$$r_{ix} s_i s_x = (\bar{X}_p - \bar{X}_q)pq \qquad [23.15]$$


Thus the contribution of the individual item to the total variance is, very 
simply, the difference between the means of those who pass and those who 
fail the item multiplied by pq. 

In the covariance term $r_{ix} s_i s_x$, $s_x$ is the same for all items. This circumstance has led to the use of $r_{ix} s_i$, instead of $r_{ix} s_i s_x$, as an index of item selection. The quantity $r_{ix} s_i$ has been referred to by some writers as the reliability index. If we divide formula [23.14] by $s_x$, we note that

$$s_x = \sum_{i=1}^{n} r_{ix} s_i \qquad [23.16]$$

Whereas the sum of the n terms $\sum r_{ix} s_i s_x$ is the test variance, the sum of the n terms $\sum r_{ix} s_i$ is the test standard deviation.
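Formulas [23.14] and [23.16] can be verified on the covariance matrix of Sec. 23.5. In the sketch below each column sum is taken as the item-test covariance, and dividing it by the test standard deviation gives the reliability index; the variable names are illustrative.

```python
from math import sqrt

# Interitem covariance matrix from Sec. 23.5 (values from the text).
cov = [
    [ .16, -.08, -.08, -.02, .08],
    [-.08,  .24,  .14,  .16, .06],
    [-.08,  .14,  .24,  .06, .06],
    [-.02,  .16,  .06,  .24, .14],
    [ .08,  .06,  .06,  .14, .24],
]

# Column sums: the contribution r_ix * s_i * s_x of each item to the variance.
contributions = [sum(row[j] for row in cov) for j in range(len(cov))]
test_variance = sum(contributions)            # formula [23.14]
s_x = sqrt(test_variance)

# Reliability index r_ix * s_i for each item; these sum to s_x, formula [23.16].
reliability_index = [c / s_x for c in contributions]
print([round(c, 2) for c in contributions])
print(round(test_variance, 2), round(sum(reliability_index), 3), round(s_x, 3))
```

On either index item 1 contributes least, and items 4 and 5 contribute most.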


The values of $r_{ix} s_i$ are slightly more difficult to compute than those of $r_{ix} s_i s_x$, since $r_{ix} s_i = (\bar{X}_p - \bar{X}_q)pq/s_x$. The statistic $r_{ix} s_i s_x$, or its simple modification $r_{ix} s_i$, is probably the most useful single index for the selection of items in common use.

EXERCISES


1 The following is the response pattern for a test of five items administered to a sample of 12 subjects.

Item    Number passing, $P_i$
1       9
2       8
3       6
4       4
5       3

Mean test score: $\bar{X} = 2.50$

Test scores, $X_j$, are shown along the bottom of the table. The number of individuals passing each item, $P_i$, is shown to the right. Calculate (a) the item means and (b) the item variances.


2 Calculate (a) the interitem covariances and (b) the intercorrelations for the data of Exercise 1.

3 If the response pattern of Exercise 1 had contained no inconsistencies, what would the variance of scores have been?

4 If the responses in the rows of the response pattern of Exercise 1 had been assigned at random, what would be the expected variance of scores?

5 Calculate the point biserial correlation between the individual items and the total test scores for the data of Exercise 1.

6 For the data of Exercise 1 calculate the biserial correlation between item 1 and scores on the whole test.

7 For the data of Exercise 1 calculate the contribution of each item to the total test variance.

8 For the data of Exercise 1 calculate the contribution of each item to the total test standard deviation.


ERRORS OF MEASUREMENT

24.1 THE NATURE OF ERROR


The measurements obtained in the conduct of experiments are subject to error of greater or less degree. In measuring the activity of a rat, the intelligence of a child, or the response latency of an experimental subject, we may assume that the individual measurements are subject to some error. In general, the concept of error always implies a true, fixed, standard, or parametric value which we wish to estimate and from which an observed measurement may differ by some amount. The difference between a true value and an observed value is an error. If we represent a particular observation by $X_i$, the true value which it purports to estimate by $T_i$, and an error by $e_i$, we may write

$$e_i = X_i - T_i \qquad [24.1]$$

where $e_i$ may take either positive or negative values.

A distinction may be made between systematic and random error. Observations which consistently overestimate or underestimate the true value are subject to systematic error. A stopwatch which underestimates time intervals will yield observations with systematic errors. A random error exhibits no systematic tendency to be either positive or negative and is assumed to average to zero over a large number of subjects or trials. Random errors are also assumed to be uncorrelated both with true scores and with each other. The discussion in this chapter is concerned exclusively with random errors.

Any definition of error as the difference between an observed and true value is meaningless unless a precise definition is attached to the concept of true value. In theory a true value is sometimes conceptualized as the mean of an indefinitely large number of measurements of an attribute made under conditions such that the true value remains constant, and the procedures used in making the measurements do not change from trial to trial in any known systematic fashion. In mathematical language the true value may be defined as

$$T_i = \lim_{K \to \infty} \frac{\sum_{j=1}^{K} X_{ij}}{K}$$

where $X_{ij}$ refers to the jth measurement. Thus the true value is the limit approached by the arithmetic mean as the number of repeated observations K is increased indefinitely. This concept of true value is appropriate for the measurements of physical quantities. For example, a yardstick may be used to measure the length of a desk. The measurement procedure may be repeated many times, and the variation in the observations attributed to error. It may be assumed that a considerable number of repeated observations may be made under fairly constant conditions, neither the desk nor the yardstick changing in any systematic way. By increasing the number of observations and taking their mean, the error in estimating the true value may be reduced. Theoretically, this error may be made as small as we like by increasing the number of observations. As the number of observations becomes indefinitely large, the mean approaches the true value as a limit.
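The limiting behavior described above can be illustrated with a small simulation. The true value, the spread of the random error, and the normal form of the error are all hypothetical choices made for this sketch; nothing in the text fixes them.

```python
import random

random.seed(0)          # fixed seed so that the sketch is repeatable
TRUE_VALUE = 150.0      # hypothetical true length of the desk, arbitrary units
ERROR_SD = 0.5          # hypothetical spread of the random measurement error

def mean_of_measurements(k):
    """Mean of k simulated measurements, each the true value plus random error."""
    return sum(TRUE_VALUE + random.gauss(0, ERROR_SD) for _ in range(k)) / k

# The error of the mean tends to shrink as the number of observations K grows.
for k in (10, 1_000, 100_000):
    estimate = mean_of_measurements(k)
    print(k, round(abs(estimate - TRUE_VALUE), 4))
```

With random error of this kind the discrepancy between the mean and the true value shrinks roughly in proportion to one over the square root of K.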
Questions may be raised about the appropriateness of this concept of true value in the measurement of psychological quantities. Clearly, in the measurement of human behavior the making of a large number of repeated observations is usually not possible. The attribute being measured may fluctuate or change markedly with time, or the process of repeated measurement may modify the attribute under study. For example, in measuring the intelligence of a child, it is obviously out of the question to administer the same intelligence test 100 times to obtain an estimate of error. Quite apart from the labor involved in such estimation, the results obtained would be invalidated by practice, fatigue, and other effects. This circumstance has given rise in psychological work to a variety of procedures for estimating error other than by a series of repeated measurements. Despite the operational impracticality in psychology of estimating error by making a large number of repeated measurements, the concept of true score as the mean of an indefinitely large number of such measurements is still an important concept in the study of errors of measurement.

Here we note that the role of true score is analogous to that of population parameter in sampling statistics. The difference between the sample statistic and the population parameter is a sampling error. By increasing sample size the magnitude of sampling error is reduced. For an infinite population an unbiased sample statistic will approach the population parameter as a limit as the sample becomes indefinitely large. A sampling error is an error associated with a statistic based on a sample of observations. An error of measurement is usually construed to be an error associated with a particular observation which is an estimate of a true value. In most instances both population parameters and true values cannot be known but can only be estimated from fallible data. This circumstance does not detract from the meaningfulness of these concepts, nor does it prevent the making of meaningful statements about the magnitude of error.


24.2 EFFECT OF MEASUREMENT ERROR ON THE MEAN AND VARIANCE


Consider a population of measurements. Each measurement is subject to error and may be written as $X_i = T_i + e_i$, where $X_i$ is the observed and $T_i$ the true measurement. By summation over all members of the population we obtain $\sum X_i = \sum T_i + \sum e_i$. If we assume that measurement error is random, and as often positive as negative, we may write $\sum e_i = 0$. Consequently, the sum of measurements subject to error is equal to the sum of true measurements. It follows also that the means of the observed and true values are equal, both being equal to the population mean $\mu$. We conclude that measurement error exerts no systematic effect on the arithmetic mean. A mean based on a sample of N measurements will exhibit no tendency to be either greater than or less than the mean of true measurements. The expectations of the mean of observed and true scores are equal to the population mean $\mu$; that is,

$$E(\bar{X}) = E(\bar{T}) = \mu \qquad [24.2]$$

Measurement error exerts an effect on the sampling variance of the arithmetic mean. This point is discussed in Sec. 24.6.

Measurement error exerts a systematic effect on the variance. We may write $(X_i - \mu) = (T_i - \mu) + e_i$. If we square this identity, sum over all members of the population, and divide by $N_p$, where $N_p$ is the number of members in the population, we obtain

$$\frac{\sum (X_i - \mu)^2}{N_p} = \frac{\sum (T_i - \mu)^2}{N_p} + \frac{\sum e_i^2}{N_p} + \frac{2\sum (T_i - \mu)e_i}{N_p}$$

On the assumption that measurement errors are random and uncorrelated with true scores, the third term to the right is equal to zero, and we may write

$$\sigma_x^2 = \sigma_T^2 + \sigma_e^2 \qquad [24.3]$$

Thus the variance of observed scores is equal to the variance of true scores plus the variance of the errors of measurement. For a fixed $\sigma_T^2$, the more inaccurate the measurements the greater the value of $\sigma_e^2$ and the greater the variance $\sigma_x^2$.
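The additive relation between observed, true, and error variance can be seen exactly in a small constructed population. The scores below are hypothetical and are arranged so that the errors sum to zero and are uncorrelated with the true scores, as the derivation assumes.

```python
from statistics import mean, pvariance

true_scores = [10, 20, 30, 40]

# Each true score is paired once with an error of +2 and once with -2, so the
# errors sum to zero and are uncorrelated with the true scores.
T = [t for t in true_scores for _ in (0, 1)]
e = [err for _ in true_scores for err in (+2, -2)]
X = [t + err for t, err in zip(T, e)]

# Population variances (divisor N_p), as in the derivation above.
var_T, var_e, var_X = pvariance(T), pvariance(e), pvariance(X)
print(mean(T), mean(X))
print(var_T, var_e, var_X)
```

The means of the observed and true scores coincide, while the observed variance exceeds the true variance by exactly the error variance.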


24.3 THE RELIABILITY COEFFICIENT


Consider a situation where each member of a population has been measured on two separate occasions. Two observations are available for each member. Both are presumed to be measures of the same attribute, and both are subject to error. We may write

$$X_{i1} = T_i + e_{i1}$$
$$X_{i2} = T_i + e_{i2}$$

In deviation form these become

$$(X_{i1} - \mu) = (T_i - \mu) + e_{i1}$$
$$(X_{i2} - \mu) = (T_i - \mu) + e_{i2}$$

By multiplying these two equations, summing over a population of $N_p$ members, and dividing by $N_p \sigma_1 \sigma_2$, we obtain

$$\rho_{xx} = \frac{\sum (X_{i1} - \mu)(X_{i2} - \mu)}{N_p \sigma_1 \sigma_2} = \frac{\sum (T_i - \mu)^2 + \sum e_{i1} e_{i2} + \sum e_{i1}(T_i - \mu) + \sum e_{i2}(T_i - \mu)}{N_p \sigma_1 \sigma_2}$$

On the assumption that errors are random and uncorrelated with each other and with true scores, the three terms on the right in the numerator are equal to zero. Because the paired observations are measures of the same attribute, $\sigma_1 = \sigma_2$. Also $\sum (T_i - \mu)^2 = N_p \sigma_T^2$. Hence, writing $\sigma_1 = \sigma_2 = \sigma_x$,

$$\rho_{xx} = \frac{\sigma_T^2}{\sigma_x^2} \qquad [24.4]$$

where $\rho_{xx}$ is the reliability coefficient. The reliability coefficient is a simple proportion. It is the proportion of obtained variance that is true variance. If $\sigma_x^2 = 400$ and $\sigma_T^2 = 360$, the reliability coefficient $\rho_{xx} = .90$. This means that 90 per cent of the variation in the measurements is attributable to variation in true score, the remaining 10 per cent being attributable to error.

Where sample estimates are used we may write

$$r_{xx} = \frac{s_T^2}{s_x^2} \qquad [24.5]$$

where $r_{xx}$ is the sample estimate of the reliability coefficient.
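The identity between the correlation of two fallible measurements and the ratio of true to observed variance can be checked on a constructed population. The scores and errors below are hypothetical; every true score is crossed with every pair of errors so that the assumptions of the derivation hold exactly.

```python
from itertools import product
from math import sqrt

def pvar(v):
    """Population variance (divisor N)."""
    m = sum(v) / len(v)
    return sum((a - m) ** 2 for a in v) / len(v)

def pearson(x, y):
    """Product-moment correlation with divisor N."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return sxy / sqrt(pvar(x) * pvar(y))

true_scores = [10, 20, 30, 40]
errors = [+5, -5]

# Crossing true scores with both error series makes the errors uncorrelated
# with each other and with the true scores.
cases = list(product(true_scores, errors, errors))
T = [t for t, e1, e2 in cases]
X1 = [t + e1 for t, e1, e2 in cases]
X2 = [t + e2 for t, e1, e2 in cases]

r_xx = pearson(X1, X2)              # correlation between the two measurements
ratio = pvar(T) / pvar(X1)          # true variance over observed variance
print(round(r_xx, 4), round(ratio, 4))
print(360 / 400)                    # the numerical example in the text
```

The two printed quantities on the first line agree, and the last line reproduces the .90 of the text's example.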


24.4 METHODS FOR ESTIMATING RELIABILITY


Above, the reliability coefficient has been discussed without reference to methods for obtaining such coefficients in practice. A number of different practical methods for estimating reliability are used. These methods are as follows:

1 Test-retest method. The same measuring instrument is applied on two occasions to the same sample of individuals. When the instrument is a psychological test, the test is administered twice to a sample of individuals and the scores correlated.

2 Parallel-forms method. Parallel or equivalent forms of a test may be administered to the same group of subjects, and the paired observations correlated. Criteria of parallelism are required.

3 Split-half method. This method is appropriate where the testing procedure may in some fashion be divided into two halves and two scores obtained. These may be correlated. With psychological tests a common procedure is to obtain scores on the odd and even items.

4 Internal-consistency methods. These are used with psychological tests comprised of a series of items, usually dichotomously scored, a 1 being assigned for a pass and a 0 for a failure. These methods require a knowledge of certain test-item statistics.


The interpretation of a reliability coefficient depends on the method used to obtain it. When the same test is administered twice to the same group with a time interval separating the two administrations, some variation, fluctuation, or change in the ability or function measured may occur. The departure of $r_{xx}$ from unity may be construed to result in part from error and in part from changes in the ability or function measured. With many psychological tests the value of $r_{xx}$ will show a systematic decrease with increase in the time interval separating the two administrations. When the time interval is short, memory effects may operate. The subject may recall many of his previous responses and proceed to reproduce them. A spuriously high correspondence between measurements obtained at the two testings may thereby result. Regardless of the time interval separating the two testings, varying environmental conditions such as noise, temperature, and other factors may affect the result obtained. Likewise, varying physiological factors, fatigue and the like, may exert an influence.

In estimating reliability by the administration of parallel or equivalent forms of a test, criteria of parallelism are required. Test content, type of item, instructions for administering, and the like, should be similar for the different forms. Also the parallel forms ... equal. Thus with three parallel tests the intercorrelations should be such that $r_{12} = r_{13} = r_{23}$. A discussion of criteria for parallel tests is given by Gulliksen (1950). Situations arise where a large pool or population of test items is available. Samples of items may be drawn at random. Each sample of items is a randomly parallel form. This approach to the development of parallel tests has been studied at length by Lord (1955a, 1955b).

In many situations a single administration only of a test may be possible. 
The test is divided into two halves. A not uncommon procedure is to divide
a test into odd and even items. Scores are obtained on the two halves, and
these are correlated. The result is a reliability coefficient for a half test.
Given a reliability coefficient for a half test, the reliability coefficient for a
whole test may be estimated using the Spearman-Brown formula. This
formula is

$$r_{xx} = \frac{2r_{hh}}{1 + r_{hh}} \tag{24.6}$$

where $r_{hh}$ is the reliability of a half test. If, for example, $r_{hh} = .80$, then
$r_{xx} = .89$. The Spearman-Brown formula provides an estimate of the relia-
bility of the whole test. It estimates what the reliability would be if each
test half were made twice as long.
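The odd-even procedure and the Spearman-Brown step-up may be sketched in code as follows. This is an illustrative sketch only: the function names and the layout of the data (one list of item scores per person) are assumptions made for the example, not part of the text.

```python
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation between two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_brown_whole(r_hh):
    """Whole-test reliability estimated from a half-test reliability."""
    return 2 * r_hh / (1 + r_hh)

def odd_even_reliability(item_scores):
    """item_scores holds one list of item scores per person (assumed layout)."""
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return spearman_brown_whole(pearson_r(odd, even))

print(round(spearman_brown_whole(0.80), 2))  # 0.89, the example in the text
```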

The split-half method should not be used with highly speeded test mate-
rial. Obviously, if a test is comprised of easy items, and a subject is re-
quired to complete as many items as possible within a limited time interval,
and all or nearly all items are correct, the scores on the two halves would
be about the same and the correlation would be close to +1.00.

A method of obtaining reliability coefficients using test-item statistics
has been developed by Kuder and Richardson (1937). Many psychological
tests are constructed of dichotomously scored items. An individual either
passes or fails the item. A 1 is assigned for a pass, and 0 for a failure. The
score is the number of items done correctly. The proportion of individuals
passing item i is denoted by the symbol $p_i$, and the proportion failing, by $q_i$,
where $q_i = 1 - p_i$. An estimate of reliability is given by

$$r_{xx} = \frac{n}{n - 1}\left(\frac{s_x^2 - \sum_{i=1}^{n} p_i q_i}{s_x^2}\right) \tag{24.7}$$

where $n$ = number of items
$s_x^2$ = variance of scores on test defined as $\sum (X - \bar{X})^2 / N$
$p_i q_i$ = product of proportion of passes and fails for item i
$\sum_{i=1}^{n} p_i q_i$ = sum of these products for n items

This formula is frequently referred to as Kuder-Richardson formula 20.
The coefficient $r_{xx}$ computed by this formula will take values ranging from
zero to unity. If the responses of individuals to the test items are assigned
at random, the expectation of $s_x^2$ is equal to $\sum_{i=1}^{n} p_i q_i$ and the expectation of
$r_{xx}$ is zero. If all items are perfectly correlated, a situation which can only
arise when all have the same difficulty, $r_{xx} = 1$. The correlation between
items is the phi coefficient.

If all assumptions implicit in the split-half method of estimating reliabil-
ity coefficients are satisfied, the split-half and Kuder-Richardson formula
20 will yield identical results (Ferguson, 1951). Because these assumptions
are rarely, if ever, satisfied in practice, differences in the coefficients 
obtained will result. One difficulty with the split-half method is that a test 
may be split in a great many ways, yielding many different values of $r_{xx}$. It
may be shown that if a test is split in all possible ways, the average of all 
the split-half reliability coefficients with the Spearman-Brown correction is 
the Kuder-Richardson formula 20. This coefficient has a simple unique 
value for any particular test. 
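A minimal sketch of Kuder-Richardson formula 20 for dichotomously scored items is given below. The function name and the data layout (one list of 0/1 item scores per person) are assumptions for illustration; the score variance uses the divisor N, as in the definition above.

```python
def kr20(item_scores):
    """Kuder-Richardson formula 20 for 0/1 item scores (assumed layout:
    one list of item scores per person)."""
    n_people = len(item_scores)
    n_items = len(item_scores[0])
    totals = [sum(row) for row in item_scores]
    mean = sum(totals) / n_people
    # variance of total scores, divisor N
    var_x = sum((t - mean) ** 2 for t in totals) / n_people
    sum_pq = 0.0
    for i in range(n_items):
        p = sum(row[i] for row in item_scores) / n_people  # proportion passing
        sum_pq += p * (1 - p)
    return (n_items / (n_items - 1)) * (var_x - sum_pq) / var_x

# Perfectly correlated items of equal difficulty give a coefficient of 1.
perfect = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
print(kr20(perfect))
```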

The Kuder-Richardson formula 20 is a measure of the internal consis- 
tency, or homogeneity, or scalability, of the test material. In this context 
these three terms may be considered synonymous. If the items on a test 
have high intercorrelations with each other and are measures of much the 
same attribute, then the reliability coefficient will be high. If the inter- 
correlations are low, either because the items measure different attributes 
or because of the presence of error, then the reliability coefficient will be 
low. 

The Kuder-Richardson formula 20 may be applied to tests comprised of 
items which elicit more than two categories of response. Personality and 
interest inventories and attitude scales frequently permit three or more 
response categories. For a dichotomously scored item we note that $p_i q_i$ is
the item variance $s_i^2$ and $\sum_{i=1}^{n} p_i q_i = \sum_{i=1}^{n} s_i^2$, the sum of the item variances.
For an item with more than two response categories, where each category
has been assigned a weight, the individual item variances may be calcu-
lated and their sum may be substituted in Kuder-Richardson formula 20 for
$\sum_{i=1}^{n} p_i q_i$. Consider a test comprised of statements which elicit the possible
responses "agree," "undecided," "disagree." Let $p_1$, $p_2$, and $p_3$ be the
proportion of individuals responding in the three categories. If weights 3, 2,
1 or $+1$, $0$, $-1$, or any other system of weights, are assigned to the
categories, the item variance may be calculated. These may be summed,
and the sum substituted for $\sum_{i=1}^{n} p_i q_i$. The quantity $s_x^2$ is, of course, the
variance of scores obtained by summing items with the assigned weights.

For further discussion see Ferguson (1951). 
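The substitution of summed item variances for the sum of the $p_i q_i$ may be sketched as follows. The helper names and data layout are assumed for illustration. For 0/1 items the item variance equals $p_i q_i$, so the result then agrees with formula 20.

```python
def variance(values):
    """Variance with divisor N."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def kr20_weighted(item_scores):
    """Formula 20 with item variances in place of p*q.
    item_scores: one list of weighted item responses per person (assumed layout)."""
    n_items = len(item_scores[0])
    var_x = variance([sum(row) for row in item_scores])
    sum_item_var = sum(
        variance([row[i] for row in item_scores]) for i in range(n_items)
    )
    return (n_items / (n_items - 1)) * (var_x - sum_item_var) / var_x

# Two items weighted 3 = agree, 2 = undecided, 1 = disagree (made-up responses).
responses = [[3, 3], [2, 2], [1, 1]]
print(kr20_weighted(responses))
```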

On the assumption that all test items are of equal difficulty, a simplified 
form of the Kuder-Richardson formula may be obtained for use with dichot- 
omously scored test items. This formula may be written as 


$$r_{xx} = \frac{n}{n - 1}\left[1 - \frac{\bar{X}(n - \bar{X})}{n s_x^2}\right] \tag{24.8}$$

where $\bar{X}$ is the mean test score and $s_x^2$ is the variance. This formula is
referred to as Kuder-Richardson formula 21. The formula may be derived
using the assumptions implicit in the concept of randomly parallel tests
(Sec. 24.9).
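Formula 21 requires only the number of items, the mean, and the variance, as the following sketch shows. The function name and the numerical values are assumed for illustration. When all items have the same difficulty p, the mean is np and the sum of the $p_i q_i$ equals $\bar{X}(n - \bar{X})/n$, which is why formula 21 then reproduces formula 20.

```python
def kr21(n_items, mean, variance):
    """Kuder-Richardson formula 21 from summary statistics."""
    return (n_items / (n_items - 1)) * (
        1 - mean * (n_items - mean) / (n_items * variance)
    )

# Illustrative values: 20 items, mean 10, variance 25.
print(round(kr21(20, 10, 25), 3))
```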


24.5 EFFECT OF TEST LENGTH ON THE RELIABILITY COEFFICIENT


In discussing split-half reliability, a formula was given for estimating the 
reliability of a whole test from the reliability of a half test. This formula is a 
particular case of a more general Spearman-Brown formula for estimating 
increased reliability with increased test length. The more general formula is

$$r_{kk} = \frac{k r_{xx}}{1 + (k - 1) r_{xx}} \tag{24.9}$$

where $r_{xx}$ = an estimate of reliability of a test of unit length
$r_{kk}$ = reliability of test made k times as long

If $r_{xx} = .60$ and the test is made four times as long, the reliability coeffi-
cient $r_{kk}$ for the lengthened test is estimated as .86. From a theoretical
point of view a test may be made as reliable as we like by increasing its 
length. Practical considerations, of course, restrict test length. 
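The general Spearman-Brown formula may be sketched as below. The second function, which solves the same formula for k, is a derived helper added for illustration and does not appear in the text. Both function names are assumptions.

```python
def spearman_brown(r_xx, k):
    """Reliability of a test made k times as long."""
    return k * r_xx / (1 + (k - 1) * r_xx)

def required_length_factor(r_xx, r_target):
    """Value of k needed to reach r_target, obtained by solving the
    Spearman-Brown formula for k (derived helper, not from the text)."""
    return r_target * (1 - r_xx) / (r_xx * (1 - r_target))

print(round(spearman_brown(0.60, 4), 2))  # 0.86, the example in the text
```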

Because reliability is a function of test length, reliability coefficients 
calculated on tests of different lengths are, for certain purposes, not 
directly comparable. If, for example, we wish to compare the reliability of 
different types of test material, we presumably should require measures
which were independent of the differing lengths of the tests. One proce-
dure here is to use the Spearman-Brown formula and calculate reliability
coefficients for a standard test of 100 items. If a test has 40 items, then a
value of k = 100/40 = 2.50 would be used in estimating the reliability of the
standard test. If another test has 150 items, then k = 100/150 = .67, and so on.
Thus a comparison of the reliabilities of different tests may be made which
is independent of differing test lengths.
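The adjustment to a standard length of 100 items may be sketched as follows. The function name and the reliability value .60 are assumed for illustration.

```python
def standardized_reliability(r_xx, n_items, standard_length=100):
    """Spearman-Brown estimate of reliability at a standard test length."""
    k = standard_length / n_items
    return k * r_xx / (1 + (k - 1) * r_xx)

# A 40-item and a 150-item test, each with an observed reliability of .60.
print(round(standardized_reliability(0.60, 40), 3))
print(round(standardized_reliability(0.60, 150), 3))
```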


24.6 EFFECT OF MEASUREMENT ERROR ON THE SAMPLING VARIANCE OF THE MEAN

Because measurement error affects the variance of a set of measurements
it will also affect the sampling variance of the mean. The sampling variance
of the arithmetic mean may be written as

$$\sigma_{\bar{x}}^2 = \frac{\sigma_x^2}{N} = \frac{\sigma_T^2}{N} + \frac{\sigma_e^2}{N} \tag{24.10}$$

The component $\sigma_T^2/N$ is the sampling variance of the means of samples of
true measurements, and $\sigma_e^2/N$ is the component of the sampling variance
attributable to measurement error. While measurement error exerts no
systematic effect on the sample mean as an estimate of $\mu$, such error
increases the variation in sample means with repeated sampling. The
increase in sampling variance over that with no measurement error present
is $\sigma_e^2/N$.

The ratio of the sampling variance of the mean of true scores to the
sampling variance of the mean of obtained scores is the reliability coeffi-
cient. Thus

$$\rho_{xx} = \frac{\sigma_T^2/N}{\sigma_x^2/N} = \frac{\sigma_T^2}{\sigma_x^2} \tag{24.11}$$

This means that the reliability coefficient may be interpreted as descriptive
of the loss in efficiency of estimation resulting from measurement error. To
illustrate, a mean calculated on a sample of 100 cases, where $\rho_{xx} = .80$,
has a sampling variance equal to that of a mean calculated on a sample of
80 cases where $\rho_{xx} = 1.00$. The loss in efficiency of estimation resulting
from measurement error amounts to 20 cases in 100. 
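The partition of the sampling variance of the mean, and the equivalent error-free sample size, may be sketched as follows. The function names and the variance components 80 and 20 are assumed for illustration.

```python
def sampling_variance_of_mean(var_true, var_error, n_cases):
    """Return (true-score component, error component, total)."""
    true_part = var_true / n_cases
    error_part = var_error / n_cases
    return true_part, error_part, true_part + error_part

def equivalent_error_free_n(n_cases, reliability):
    """Number of error-free cases giving the same sampling variance of the mean."""
    return n_cases * reliability

true_part, error_part, total = sampling_variance_of_mean(80.0, 20.0, 100)
print(true_part / total)                   # the reliability coefficient
print(equivalent_error_free_n(100, 0.80))  # about 80 cases, as in the text
```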


24.7 EFFECT OF ERRORS OF MEASUREMENT ON THE CORRELATION COEFFICIENT

[...] and Y in the population, the relation between the correlation of true and
obtained scores is given by

$$\rho_{T_x T_y} = \frac{\rho_{xy}}{\sqrt{\rho_{xx}\rho_{yy}}} \tag{24.12}$$

where $\rho_{T_x T_y}$ = correlation between true scores
$\rho_{xx}$ = reliability of X
$\rho_{yy}$ = reliability of Y

$$r_{T_x T_y} = \frac{r_{xy}}{\sqrt{r_{xx} r_{yy}}} \tag{24.13}$$

[...] $r_{xx} = .80$, and $r_{yy} = .90$. The correlation
between true scores on X and Y, estimated by the above formula, is .707.
[...] attenuated from .707 to .60 because of
[...] squares of these coefficients yield a better
[...] capacity. If the correlation between two variables is low, [...]
will not be markedly increased by improvements in reliability. If the corre-
lation is high, improving reliability may result in substantial gains in the
prediction of one variable from another.


Because the correlation between true scores can never exceed unity, the
maximum correlation between two variables arises where $r_{T_x T_y} = 1$. Under
this circumstance $r_{xy} = \sqrt{r_{xx} r_{yy}}$. This is an estimate of the maximum cor-
relation between X and Y. If $r_{xx} = .80$ and $r_{yy} = .90$, the maximum possi-
ble correlation between X and Y is estimated as $\sqrt{.80 \times .90} = .85$.
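The correction for attenuation and the estimated ceiling on the obtained correlation may be sketched as follows. The function names are assumed. The obtained correlation of .60 is the value implied by the worked figures of this section.

```python
from math import sqrt

def true_score_correlation(r_xy, r_xx, r_yy):
    """Estimated correlation between true scores (correction for attenuation)."""
    return r_xy / sqrt(r_xx * r_yy)

def maximum_correlation(r_xx, r_yy):
    """Estimated maximum obtainable correlation between X and Y."""
    return sqrt(r_xx * r_yy)

print(round(true_score_correlation(0.60, 0.80, 0.90), 3))  # 0.707
print(round(maximum_correlation(0.80, 0.90), 2))           # 0.85
```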


24.8 RELIABILITY OF DIFFERENCE SCORES

Situations arise where the difference between two sets of measurements is
defined as a score. The two measurements may be initial, or prestimulus
values, and values obtained in the presence of a stimulus factor. If differ-
ences are obtained between standard scores on X and Y, that is,
between $z_x$ and $z_y$, the reliability of the differences may be estimated by

$$r_{dd} = \frac{r_{xx} + r_{yy} - 2r_{xy}}{2(1 - r_{xy})} \tag{24.14}$$

where $r_{xx}$ and $r_{yy}$ = reliability coefficients for X and Y
$r_{dd}$ = reliability of difference $z_x - z_y$

For fixed values of $r_{xx}$ and $r_{yy}$ the reliability of the difference will decrease
with increase in $r_{xy}$ from zero. If $r_{xx} = .90$ and $r_{yy} = .80$, for $r_{xy} = .80$ the
reliability of differences $r_{dd} = .25$. For $r_{xy} = 0$, $r_{dd} = .85$. As $r_{xy}$ departs in a
positive direction from zero, the error variance accounts for an increasing
proportion of the total variance of differences, with a resulting decrease in
reliability. The point to note here is that difference scores may be grossly
unreliable and should be used only after careful scrutiny of the data. When
the correlation between the two variables is reasonably high, it is probable
that with many sets of data most of the variance of differences is error
variance.
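The decline in the reliability of standard-score differences as the correlation between X and Y rises may be sketched as follows. The function name is an assumption made for the example.

```python
def reliability_of_difference(r_xx, r_yy, r_xy):
    """Reliability of the difference between standard scores on X and Y."""
    return (r_xx + r_yy - 2 * r_xy) / (2 * (1 - r_xy))

# Reliabilities .90 and .80, with increasing correlation between X and Y.
for r_xy in (0.0, 0.4, 0.8):
    print(r_xy, round(reliability_of_difference(0.90, 0.80, r_xy), 2))
```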


24.9 THE STANDARD ERROR OF MEASUREMENT

Because $\rho_{xx} = \sigma_T^2/\sigma_x^2$ and $\sigma_x^2 = \sigma_T^2 + \sigma_e^2$, we may write

$$\sigma_e^2 = \sigma_x^2 (1 - \rho_{xx}) \tag{24.15}$$

and

$$\sigma_e = \sigma_x \sqrt{1 - \rho_{xx}} \tag{24.16}$$

This latter formula is the standard error of measurement. Where $s_x$ and $r_{xx}$
are used as estimates of $\sigma_x$ and $\rho_{xx}$, we obtain

$$s_e = s_x \sqrt{1 - r_{xx}} \tag{24.17}$$

as the corresponding sample estimate. If it may be assumed that errors of
measurement are independent of the magnitude of test score, then $s_e$ may
be used as the standard error associated with a single score and inter-
preted in the same way as the standard error of any statistic. On the
assumption of a normal-curve approximation, the 95 and 99 per cent con-
fidence intervals of an individual's score $X_i$ are estimated by $X_i \pm 1.96s_e$
and $X_i \pm 2.58s_e$, respectively. With most psychological tests, however,
errors of measurement are not independent of the magnitude of test score.
The standard error is higher in the middle-score range and diminishes in
size as the score departs from the average. Because of this the use of $s_e$ to
estimate confidence intervals for particular scores may yield misleading
results. The variance $s_e^2$ is a sort of average value, and $s_e$ when applied to
particular scores has meaning only in relation to scores near the average.
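The sample standard error of measurement and the normal-curve confidence limits may be sketched as follows. The function names and the values $s_x = 10$, $r_{xx} = .84$, and a score of 50 are assumed for illustration. The limits rest on the assumption, questioned above, that error is independent of score level.

```python
from math import sqrt

def standard_error_of_measurement(s_x, r_xx):
    """Sample estimate s_e = s_x * sqrt(1 - r_xx)."""
    return s_x * sqrt(1 - r_xx)

def confidence_interval(score, s_e, z=1.96):
    """Normal-curve limits for an individual score (z = 1.96 or 2.58)."""
    return score - z * s_e, score + z * s_e

s_e = standard_error_of_measurement(10, 0.84)
print(round(s_e, 2))
print(confidence_interval(50, s_e))
```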

The problem of the standard error of measurement associated with psy- 
chological test scores has been investigated by Lord (1955a, 1955b, 1957). 
Lord defines the standard error of measurement as the standard deviation 
of scores an individual might be expected to obtain on a large number of 
randomly parallel test forms. The assumption is that the ability of the indi- 
vidual remains unchanged and is not affected by practice, fatigue, and the 
like. Randomly parallel forms are viewed as composed of items drawn at 
random from a large pool or population of items. The items are scored 1 for
a pass and 0 for a failure, a score on a test being the sum of item scores.
The proportion of items in the population which individual i can do cor-
rectly is $\theta_i$. The true score of individual i for a test of n items is $T_i = n\theta_i$.
The number of items done correctly by individual i for a random sample of
n items is $X_i$. The standard deviation of the sampling distribution of the
$X_i$'s is the standard error. This is obtained from the standard deviation of
the binomial and is given by

$$\sigma_e(X_i) = \sqrt{n\theta_i(1 - \theta_i)} = \sqrt{\frac{T_i(n - T_i)}{n}} \tag{24.18}$$

An individual's score $X_i$ may be used as an estimate of $T_i$. Introducing the
factor $n/(n - 1)$ to obtain an unbiased estimate yields

$$s_e(X_i) = \sqrt{\frac{X_i(n - X_i)}{n - 1}} \tag{24.19}$$

This formula may be used for estimating the standard error of a test score
$X_i$. Where n = 100 and $X_i = 50$, $s_e(X_i) = 5.02$. Where $X_i = 80$ and n = 100,
$s_e(X_i) = 4.02$. The standard error diminishes in size as the more extreme
values are approached.

Lord (1955a) has shown that if $s_e^2$ is taken as the average of $s_e^2(X_i)$ and
substituted in $r_{xx} = 1 - s_e^2/s_x^2$, unbiased variance estimates being used
throughout, Kuder-Richardson formula 21, described in Sec. 24.4, is
obtained.

In most practical situations where parallel tests are used, the tests are not
randomly parallel in the strictest sense. The items are matched to some
extent. The standard error for such tests will be less than that estimated by
$s_e(X_i)$. Thus $s_e(X_i)$ in most situations will tend to be a moderate overes-
timate. It is of interest to note that $s_e(X_i)$ is independent of the character-
istics of the items of which a test is comprised, provided, of course, that
these are scored 1 for a pass and 0 for a failure.
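The binomial standard error attached to an individual score may be sketched as follows. The function name is assumed. For a score of 50 on 100 items the computed value is about 5.025, which agrees with the figure quoted in the text to within rounding in the second decimal.

```python
from math import sqrt

def binomial_standard_error(x, n_items):
    """Standard error of score x on a test of n_items items scored 0/1."""
    return sqrt(x * (n_items - x) / (n_items - 1))

for score in (50, 80, 95):
    print(score, round(binomial_standard_error(score, 100), 3))
```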


24.10 CONCLUDING OBSERVATIONS

The theory and method associated with the study of measurement error in
psychology have been developed in relation to psychological testing. Much
of this theory and method is generally applicable to measurements of all 
kinds. Little attention has been directed to the study of measurement error 
by experimental psychologists. It is probable that in much work in the field 
of human and animal learning, fairly gross error attaches to many of the 
measurements made. Reliability coefficients less than .50 are not 
uncommon, and coefficients of zero are perhaps not isolated curiosities. 
The errors which attach to measurements in the field of animal experi- 
mentation are known quite often to be substantial. Low reliability does not 
necessarily invalidate a technique as a device for drawing valid inferences. 
Low reliability may be compensated for by increase in sample size. An 
unreliable technique used with a small sample is, however, capable of 
detecting gross differences only, and the probability of not rejecting the 
null hypothesis when it is false may be high. When significant results are 
reported with an unreliable technique on a small sample, the treatment 
applied is usually exerting a gross effect. 

A common type of experimental design requires the making of measure-
ments on an experimental group in the presence of a treatment and on a
control group in the absence of the treatment. Although substantive evi- 
dence is lacking, it is probable that in many experiments the measure- 
ments are less reliable under the experimental than under the control con- 
ditions, one of the effects of the treatment being to increase measurement 
error. It seems probable that this effect is more likely to occur when the 
treatment is in the nature of a gross assault on the normal functioning of 
the organism, as is the case with certain drugs, stress agents, and operative 
procedures. Experimental situations may be found where the treatment 
may increase rather than decrease the reliability of the measurements. 
This author can recall one experiment where the important effect of the 
treatment was to stabilize and make more reliable the responses of the 
experimental animals. 

The discussion of measurement error given in this chapter is of necessity 
brief and incomplete. The most comprehensive, and theoretically most 
advanced, discussion available on measurement error as applied to psy-
chological tests is found in Lord and Novick (1968). A straightforward and
more elementary discussion of this topic is given by Magnusson (1967). For 
a consideration of the analysis of variance as applied to test reliability and 
other specialized topics, including the Kuder-Richardson formulas, the 
reader is referred to the monograph by Jackson and Ferguson (1941). 




EXERCISES 




1 For $r_{xx} = .90$ and $s_x = 15$, estimate the variance of true scores and the
error variance. What percentage of the obtained variance is due to 
error? 


2 The following are correlations between half tests: .30, .50, .72, .80, .96.
Find reliability coefficients for the whole tests. 


3 The following are difficulty values for a test of 20 items: 


1 .97 6 .53 11 .50 16 .04
2 .95 7 515 12 .55 17 .35
3 .76 8 .40 13 .42 18 .27
4 .80 9 .82 14 .30 19 .15
5 .60 10 .20 йб 20 .09 


The standard deviation of test scores is 6.5. Calculate reliability coeffi-
cients using both Kuder-Richardson formulas 20 and 21. Explain the
difference between the two coefficients. 


4 For a particular test, $r_{xx} = .50$. What is the effect on the reliability co-
efficient of making the test five times as long? 


5 The sampling variance of an arithmetic mean of a test is 6.2 where 
$r_{xx} = .80$. What part of the sampling variance is due to sampling error,
and what part to measurement error? If the test were made three times 
as long, what proportion of the sampling variance of the mean of the 
lengthened test would be due to measurement error? 


6 Estimate the correlation between true scores on X and Y where
$r_{xy} = .60$, $r_{xx} = .80$, and $r_{yy} = .90$. What is the maximum possible cor-
relation between X and Y?


7 For the data of Exercise 6 above, calculate the reliability of difference 
scores in standard-score form between X and Y.


8 Estimate the standard error associated with the individual scores 7, 26, 
and 44 for a test of 50 items. 


SCORE TRANSFORMATIONS: NORMS

25.1 INTRODUCTION
Many varieties of transformations are used in the interpretation and analy- 
sis of statistical data. A transformation is any systematic alteration in a set 
of observations whereby certain characteristics of the set are changed and 
other characteristics remain unchanged. The representation of a set of
observations X as deviations from the mean, $X - \bar{X} = x$, is a simple transfor-
mation. The mean of the transformed values is zero. All other character-
istics of the transformed values are the same as those of the original values.
The variability, skewness, and kurtosis remain unchanged. The ordinal
properties of the data are preserved. The rank ordering of the observations
is the same as before. The transformation of a variable X to standard-score
form, $(X - \bar{X})/s = z$, results in a change both in mean and standard devia-
tion. The mean of the transformed values is zero, and the standard devia-
tion is unity. Skewness, kurtosis, and rank order are unchanged.

Certain commonly used transformations change the shape of the
frequency distribution of the variable. The variable may, for example, be
transformed to the normal form. This may involve not only a change in
mean and standard deviation, but also a change in skewness and kurtosis.
The original observations may be negatively skewed and leptokurtic. The
transformed values may be normally distributed, or approximately so. This
type of transformation does not change the rank order of the observations.
The transformations most commonly used by psychologists that alter the
shape of the frequency distribution are to the normal and rectangular
forms. The conversion of a set of observations to percentile ranks is a
transformation to a rectangular distribution.

The conversion of a set of frequencies $f_1, f_2, f_3, \ldots, f_k$ to proportions
by dividing each frequency by N, or to percentages by dividing by N and
multiplying by 100, is a simple transformation. The ordering of the trans-
formed values is the same as the ordering of the original frequencies. If
each frequency is divided by different values of N, say $N_1, N_2, N_3, \ldots,
N_k$, then the transformed values will quite probably have an order different
from the original values. The conversion of a mental age to an intelligence
quotient by dividing by chronological age and multiplying by 100 is a trans-
formation which changes the ordinal properties of the data. In converting
mental ages to intelligence quotients, not only is the order changed, but
also the mean, standard deviation, skewness, and kurtosis. The trans-
formed values have a mean of 100 in the standardization group and are
approximately normally distributed with a known standard deviation. 

Raw scores on psychological tests are usually highly arbitrary. The 
values of the mean, standard deviation, and possible range of scores reside 
in large measure in the predilection of the test constructor. Unless the 
mean, standard deviation, and something about the shape of the score dis- 
tribution are known, no proper interpretation can be attached to the origi- 
nal, or raw, scores. Such scores are frequently transformed to normal dis- 
tributions with an agreed mean and standard deviation. For example, a psy- 
chological test when administered to a representative sample of individuals 
from the population for which the test is intended may have a mean of 37 
and a standard deviation of 9.6 and be positively skewed. Scores may be 
transformed to a normally distributed variable with a mean of 100 and a 
standard deviation of 16. Scores thus transformed to the normal form 
immediately take on meaning. If an individual has a score of 116, we know
that he is one standard deviation unit above the average. Because the
scores are normally distributed we know that his performance is better
than that of about 84 per cent of the population and below the performance
of about 16 per cent of the population. The procedure for developing such a
transformation is known as standardization. A psychological test is said to
be standardized when transformed scores are available, based on a refer-
ence group of acceptable size. The transformed scores themselves are
called norms. An individual's score takes on meaning in relation to a stan- 
dard, or normative, group. Tests are frequently standardized to permit age 
allowances. This means in effect that separate norms have been prepared 
for each age group. The average child in each age group may have a mean 
transformed score of, say, 100. The standard deviation of scores for each 
age group may be 16. Thus a younger child may make a lower raw score 
than an older child but have a considerably higher transformed score. 
Intelligence quotients are transformed scores which make adjustments for 
the differing chronological ages of children taking the test. Intelligence 
quotients are presumed to be independent of chronological age within an 
accepted age range. Most published tests are accompanied by manuals 
containing conversion tables which permit the transformation of raw 
scores to standardized scores. Both normal and rectangular transforma- 
tions are used in test standardization. 




25.2 TRANSFORMATIONS TO STANDARD MEASURE 


A standard score is a deviation from the mean divided by the standard
deviation; thus $z = (X - \bar{X})/s$. The mean is the origin, and the standard
deviation is the unit of measurement. Thus a particular value is z standard-
deviation units above or below the mean. The mean of z scores is zero, and
the standard deviation is unity. The skewness and kurtosis of the distribu-
tion are unchanged. The distribution of z scores has the same shape as the
distribution of X. Standard scores on two or more variables are directly
comparable only in the sense that they have the same mean and standard
deviation.

A standard-score transformation does not change the proportionality of
scale intervals. If $X_1$, $X_2$, and $X_3$ are three measurements in raw-score form
and $z_1$, $z_2$, and $z_3$ are the same three measurements in standard-score form,
then

$$\frac{X_1 - X_2}{X_2 - X_3} = \frac{z_1 - z_2}{z_2 - z_3}$$

This means that the relative distances between the variate values remain
unchanged under a standard-score transformation. Let $X_1$, $X_2$, and $X_3$ be
20, 30, and 50. If $\bar{X} = 40$ and $s = 15$, then $z_1$, $z_2$, and $z_3$ become $-1.33$, $-.67$,
and $.67$. We note that

$$\frac{20 - 30}{30 - 50} = \frac{-1.33 + .67}{-.67 - .67} = .50$$


Standard scores involve the use of decimals and plus and minus signs. 
This is sometimes inconvenient. Also the range of values will seldom 
exceed the limits $-3$ and $+3$. It is not uncommon to select an arbitrary
origin and standard deviation to ensure that all, or nearly all, the measures
have a plus sign and that decimals are eliminated. For this purpose a mean
of 50 and a standard deviation of 15 are sometimes used. If z' denotes this
type of score, then

$$z' = 50 + 15\left(\frac{X - \bar{X}}{s}\right) = 50 + 15z$$

To change the standard deviation we multiply every standard score by 15.
To change the origin we merely add 50. Values of z' are rounded to the
nearest integer. In comparing performance on a series of tests, standard-
score values z' are more convenient than z. Of course, any other mean and
standard deviation could be selected.
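The standard-score transformation and the rescaling to z' may be sketched as follows, using the values 20, 30, and 50 with a mean of 40 and a standard deviation of 15 from the example above. The function names are assumptions made for illustration.

```python
def z_scores(values, mean, sd):
    """Standard scores z = (X - mean) / sd."""
    return [(x - mean) / sd for x in values]

def rescaled_scores(z_values, new_mean=50, new_sd=15):
    """z' = new_mean + new_sd * z, rounded to the nearest integer."""
    return [round(new_mean + new_sd * z) for z in z_values]

z = z_scores([20, 30, 50], mean=40, sd=15)
print([round(v, 2) for v in z])   # [-1.33, -0.67, 0.67]
print(rescaled_scores(z))         # [30, 40, 60]

# Proportionality of scale intervals is preserved.
print((20 - 30) / (30 - 50), (z[0] - z[1]) / (z[1] - z[2]))
```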


25.3 PERCENTILE POINTS AND PERCENTILE RANKS

In the standardization of psychological tests transformations to percentile
ranks have frequent application. Such transformations are rectangular.
Each percentile rank has the same frequency of occurrence. The
frequency distribution is flat.

A clear distinction must be made between percentile points and percen-
tile ranks. If k per cent of the members of a sample have scores less than a
particular value, that value is the kth percentile point. It is a value of the
variable below which k per cent of individuals lie. On an examination, if 85
per cent of individuals score less than 60, then 60 is the 85th percentile
point. If a frequency distribution is represented graphically and ordinates
raised at all percentile points, the total area under the frequency distribu-
tion is divided into 100 equal parts.

Percentile points may be represented by the symbols $P_0, P_1, P_2, \ldots, P_{100}$.
The points $P_0$ and $P_{100}$ are limits which include all members of the
sample. A percentile rank, as distinct from a percentile point, is a value on 
the transformed scale corresponding to the percentile point. If 60 is a score 
below which 85 per cent of individuals fall, then 85 is the corresponding 
percentile rank. As in all transformations, values on the original scale cor- 
respond to certain values on the transformed scale. In the present context 
the values on the original scale are percentile points, the corresponding 
values on the transformed scale are percentile ranks. 

The reader will recall that the median is a value of the variable above 
and below which 50 per cent of cases lie. The median is the 50th percentile 
point, $P_{50}$. The upper quartile is a value of the variable above which 25 per


A decile point is a value of the variable below which a certain percentage of 
individuals fall, the percentage being taken in units of 10. Decile ranks are
transformed values corresponding to the decile points and taking the 
integer values 1 to 10. The median is the 5th decile. An ordinate at the 


into 100 equal parts. 


For small N the computation of percentile points and percentile ranks is
not a very meaningful procedure. Given the scores 8, 17, 23, 42, 61, and 63,
obviously little meaning could possibly attach to $P_{5}$ or $P_{80}$. The conversion
of these scores to percentile ranks would be a somewhat spurious proce-
dure, with no advantage over ordinary ranks.

25.4 COMPUTATION OF PERCENTILE POINTS AND RANKS: UNGROUPED DATA

To illustrate the computation of percentile points and ranks for ungrouped
data, consider the psychological-test scores tabulated in Table 25.1. We
adopt the convention that any score value X has exact limits given by
$X - .5$ and $X + .5$. The variable is presumed to be continuous. Thus the
score 116 has exact limits 115.5 and 116.5. This convention is the same as
that used in determining the exact limits of class intervals. Let us now
calculate $P_{40}$, the 40th percentile point, the point below which 40 per cent
of individuals lie. N = 60, and 40 per cent of this is 24. The 24th individual
has a score of 110, the exact upper limit is 110.5, and this is taken as the
point on the test scale below which 24 individuals lie. Thus $P_{40} = 110.5$.
Note in this case that the 25th individual has a score of 111. The exact
lower limit of this score is also 110.5. Consider now the calculation of $P_{20}$.
We require a point on the test scale below which 12 and above which 48
individuals lie. The score of the 12th individual is 98 with an upper limit of
98.5. We note also that the score of the 13th individual is 100 with a lower
limit of 99.5. Presumably $P_{20}$ falls somewhere between 98.5 and
99.5. It is indeterminate. As an arbitrary working procedure the percentile
$P_{20}$ may be taken halfway between these two values. Thus $P_{20} = 99.0$. To
illustrate the handling of ties in the computation of percentile points let us
calculate $P_{10}$. A score is required below which 6 and above which 54 indi-
viduals fall. We note that individuals 6, 7, and 8 have the same score, 93.
Thus three individuals have scores within the exact limits 92.5 and 93.5.
Since we require a point below which 6 individuals fall, we interpolate
one-third of the way into this interval. One-third of this interval is .33, and
$P_{10} = 92.50 + .33 = 92.83$. With the above data $P_0$ may be taken as the
lower exact limit of the lowest score, or 82.5. Similarly, $P_{100}$ may be taken
as the upper exact limit of the highest score, or 139.5.

Table 25.1

Psychological-test scores for a group of 60 children arranged in order

Individual  Score    Individual  Score    Individual  Score
     1        83         21       110         41       123
     2        88         22       110         42       124
     3        88         23       110         43       124
     4        91         24       110         44       125
     5        91         25       111         45       125
     6        93         26       112         46       125
     7        93         27       114         47       126
     8        93         28       115         48       126
     9        97         29       116         49       127
    10        98         30       116         50       128
    11        98         31       116         51       130
    12        98         32       117         52         ?
    13       100         33       118         53       131
    14       101         34       119         54       132
    15       103         35       120         55       135
    16       107         36       121         56         ?
    17       107         37       122         57       136
    18       108         38       123         58       136
    19       109         39       123         59       136
    20       110         40       123         60       139
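The working rules above (exact limits of X − .5 and X + .5, interpolation within tied scores, and the halfway convention when the point falls in a gap) may be sketched as follows. The function name is an assumption. The scores are those of Table 25.1, except that the entries for individuals 36, 46, 52, and 56 are garbled or illegible in the source and are filled here with assumed values; the percentile points quoted in the text do not depend on them.

```python
from collections import Counter

# Table 25.1; entries 36, 46, 52 and 56 (121, 125, 131, 136) are assumed values.
SCORES = [
    83, 88, 88, 91, 91, 93, 93, 93, 97, 98,
    98, 98, 100, 101, 103, 107, 107, 108, 109, 110,
    110, 110, 110, 110, 111, 112, 114, 115, 116, 116,
    116, 117, 118, 119, 120, 121, 122, 123, 123, 123,
    123, 124, 124, 125, 125, 125, 126, 126, 127, 128,
    130, 131, 131, 132, 135, 136, 136, 136, 136, 139,
]

def percentile_point(scores, k):
    """Point on the score scale below which k per cent of cases lie."""
    n = len(scores)
    target = k * n / 100              # number of cases required below the point
    freq = sorted(Counter(scores).items())
    if target <= 0:
        return freq[0][0] - 0.5       # lower exact limit of the lowest score
    if target >= n:
        return freq[-1][0] + 0.5      # upper exact limit of the highest score
    below = 0
    for idx, (value, f) in enumerate(freq):
        if target < below + f:
            # interpolate within the exact limits of the tied scores
            return (value - 0.5) + (target - below) / f
        if target == below + f:
            # halfway between this upper limit and the next lower limit
            next_value = freq[idx + 1][0]
            return ((value + 0.5) + (next_value - 0.5)) / 2
        below += f

for k in (10, 20, 40):
    print(k, round(percentile_point(SCORES, k), 2))
```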

The calculation of percentile ranks as distinct from percentile points is
the reverse of the above process. Above we calculated scores corresponding
to particular ranks. We may now attend to the calculation of ranks
corresponding to particular scores. To illustrate, consider individual 32 in
Table 25.1. This individual is 32nd from the bottom. His test score is 117.
The number of individuals scoring below 117 is 31. The percentage below
is (31/60) × 100 = 51.67. The number scoring above 117 is 28. The percentage is
(28/60) × 100 = 46.67. These two percentages do not add to 100. Individual 32
occupies (1/60) × 100 = 1.67 per cent of the total scale. His percentile rank
falls between 51.67 and 51.67 + 1.67 = 53.33. We may take the mid-point
of this interval as the required percentile rank. Thus the percentile rank
corresponding to score 117 is

51.67 + 1.67/2 = 52.50

This method assumes that any rank R covers the interval R - .5 to
R + .5.

Consider the question of ties. We note that five individuals score 110.
The number of individuals scoring below 110 is 19, or (19/60) × 100 = 31.67 per
cent of the total. The number scoring above 110 is 36, or (36/60) × 100 = 60.00
per cent. The number occupying the score position 110 is 5, or (5/60) × 100 =
8.33 per cent. The required percentile rank may be taken as the mid-point
of the interval between 31.67 and 31.67 + 8.33 = 40.00. Thus the percentile
rank of the score 110 is 31.67 + 8.33/2 = 35.83.

Percentile ranks may be obtained by using the simple formula

$$PR = 100\,\frac{R - .5}{N} \qquad [25.1]$$

where R = rank of individual, counting from the bottom
      N = total number of cases


Where ties occur, R is taken as the average rank which the tied observations
occupy. The average rank of the five individuals who score 110 is 22,
and the corresponding percentile rank is, as before,

$$100\,\frac{22 - .5}{60} = 35.83$$

Percentile ranks are ordinarily rounded to the nearest whole number. Thus
the rank 35.83 becomes 36.
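
As a rough check on the rank formula, the short Python sketch below computes PR = 100(R - .5)/N with R taken as the average rank of tied cases. The function name `percentile_rank` is hypothetical, and the 60 scores in the usage note are made up so as to reproduce only the counts quoted in the text (19 below, 5 tied, 36 above); they are not the scores of Table 25.1.

```python
def percentile_rank(scores, score):
    """Percentile rank of `score`, using the average rank for ties."""
    n = len(scores)
    below = sum(1 for s in scores if s < score)
    ties = sum(1 for s in scores if s == score)
    if ties == 0:
        raise ValueError("score does not occur in the data")
    # Tied cases occupy ranks below+1 ... below+ties; take their mean.
    avg_rank = below + (ties + 1) / 2
    return 100 * (avg_rank - 0.5) / n
```

For example, `percentile_rank(list(range(50, 69)) + [110] * 5 + list(range(120, 156)), 110)` has an average rank of 22 among 60 cases and gives a value that rounds to 35.83.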


25.5 CALCULATION OF PERCENTILE POINTS
AND RANKS—GROUPED DATA 

The calculation of percentile points and ranks for grouped data will be
discussed with reference to the data of Table 25.2. Cumulative frequencies
are recorded in col. 3, and cumulative percentages in col. 4.

Let us calculate P25. N = 200, and 25 per cent of N is 50. We observe that
the 50th case falls within the interval 65 to 69. The exact limits of this
interval are 64.5 and 69.5. We must now interpolate within the interval to
locate a point below which 50 cases fall. We note that 36 cases fall below
and 26 cases within the interval containing P25. To arrive at the 50th case
we require 14 of the cases within the interval. Thus we take 14/26 of the
interval 64.5 to 69.5. This is (14/26) × 5 = 2.69. We add this to the lower limit of
the interval to obtain P25, which is 64.5 + 2.69 = 67.19.

The following formula may be used to calculate percentile points:

$$P_i = L + \frac{p_i N - F}{f}\,h \qquad [25.2]$$

where P_i = ith percentile point
      p_i = proportion corresponding to ith percentile point; thus if i = 62,
            p_i = .62
      L = exact lower limit of interval containing P_i
      F = sum of all frequencies below L
      f = frequency of interval containing P_i
      h = class interval

For P25 in Table 25.2 we have L = 64.5, p_i = .25, F = 36, f = 26, and
h = 5. Thus

$$P_{25} = 64.5 + \frac{.25 \times 200 - 36}{26} \times 5 = 67.19$$

This result is identical with that obtained previously for P25. The reader
will observe that for P50 this formula is the same as that given previously
for calculating the median from grouped data.
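
The grouped-data formula can be tried out directly on the frequencies of Table 25.2. The Python below is a minimal sketch: the names `INTERVALS`, `H` and `percentile_point_grouped` are introduced here for illustration only, and each interval is stored by its lowest whole-number score so that its exact lower limit is that score minus .5.

```python
# (lowest score in interval, frequency), from Table 25.2, lowest interval first
INTERVALS = [(40, 2), (45, 4), (50, 6), (55, 10), (60, 14), (65, 26),
             (70, 50), (75, 40), (80, 33), (85, 8), (90, 6), (95, 1)]
H = 5  # class interval


def percentile_point_grouped(pct, intervals=INTERVALS, h=H):
    """P_i = L + ((p_i * N - F) / f) * h for grouped data."""
    n = sum(f for _, f in intervals)
    target = pct * n / 100          # p_i * N
    below = 0                       # F, cases below the current interval
    for lowest, f in intervals:
        if below + f >= target:
            lower_limit = lowest - 0.5          # L
            return lower_limit + (target - below) / f * h
        below += f
    raise ValueError("pct must lie between 0 and 100")
```

Calling `percentile_point_grouped(25)` follows the same steps as the worked example: 36 cases lie below 64.5, and 14 of the 26 cases in the interval are needed.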
The calculation of percentile ranks is the reverse of the above procedure.
The cumulative percentages shown in col. 4 of Table 25.2 are the percentile
ranks corresponding to the exact top limits of the intervals. Thus
56.0 is the percentile rank corresponding to the percentile point 74.5, the
exact top limit of the interval 70 to 74. Likewise 11.0 is the percentile rank
corresponding to the percentile point 59.5, the exact top limit of the
interval 55 to 59. The percentile rank of any score may be obtained by
interpolation. What is the percentile rank corresponding to the score 81?
The score 81 falls within an interval with exact limits 79.5 and 84.5. It is 1.5
score units above the bottom of this interval. The lower limit has a percentile
rank of 76.0 and the upper limit 92.5. Thus we have two points on the
score scale corresponding to two points on the percentile-rank scale. Five
units on the score scale is equal to 92.5 - 76.0 = 16.5 units on the
percentile-rank scale, and 1.5 units on the score scale is equal to
(92.5 - 76.0)1.5/5 = 4.95 units on the rank scale. We now take 76.0 + 4.95 =
80.95 as the percentile rank of the score 81. Rounding this to the nearest
whole number gives 81. In this case the percentile rank is numerically equal
to the score.


Table 25.2  Cumulative frequencies and percentages of test scores

    1            2             3              4
  Class                    Cumulative     Cumulative
 interval    Frequency     frequency      percentage

  95-99          1            200           100.0
  90-94          6            199            99.5
  85-89          8            193            96.5
  80-84         33            185            92.5
  75-79         40            152            76.0
  70-74         50            112            56.0
  65-69         26             62            31.0
  60-64         14             36            18.0
  55-59         10             22            11.0
  50-54          6             12             6.0
  45-49          4              6             3.0
  40-44          2              2             1.0
  Total        200


The steps involved in finding percentile ranks from grouped data may be
summarized as follows:

1 Find the exact lower limit of the interval containing the score X whose
percentile rank is required.

2 Find the difference between X and the lower limit of the interval
containing it.

3 Divide this by the class interval and multiply by the percentage within
the interval.

4 Add this to the percentile rank corresponding to the bottom of the
interval.
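
The four steps above can be sketched in Python as follows, again using the frequencies of Table 25.2. The names `INTERVALS`, `H` and `percentile_rank_grouped` are hypothetical and introduced only for this illustration.

```python
# (lowest score in interval, frequency), from Table 25.2, lowest interval first
INTERVALS = [(40, 2), (45, 4), (50, 6), (55, 10), (60, 14), (65, 26),
             (70, 50), (75, 40), (80, 33), (85, 8), (90, 6), (95, 1)]
H = 5  # class interval


def percentile_rank_grouped(x, intervals=INTERVALS, h=H):
    """Percentile rank of score x by linear interpolation within its interval."""
    n = sum(f for _, f in intervals)
    below = 0
    for lowest, f in intervals:
        lower_limit = lowest - 0.5                    # step 1
        if lower_limit <= x <= lower_limit + h:
            distance = x - lower_limit                # step 2
            pct_within = 100 * f / n
            part = distance / h * pct_within          # step 3
            return 100 * below / n + part             # step 4
        below += f
    raise ValueError("score lies outside the tabled range")
```

For the score 81 the function interpolates 1.5/5 of the way through the 16.5 percentage points in the interval 79.5 to 84.5, as in the worked example.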


the manner described above. A somewhat easier procedure is to make a
graphical plotting on suitable graph paper of cumulative percentages
against the corresponding upper limits of the class intervals. Score values
are plotted on the horizontal axis, and cumulative percentages on the
vertical axis. The points may be joined by straight lines. Percentile ranks
corresponding to scores may then be read directly from the graph. If the
points are joined by straight lines, these rank values will be the same,
within limits of error, as those obtained by linear interpolation directly on
the numerical values. If the sample is small, the points when plotted may
show considerable irregularity and it may be advisable to fit a smoothed
curve to the data. The fitting of a smoothed curve by freehand methods is
accurate enough for most practical purposes. A procedure related to the
method described above is to calculate certain selected percentile points
and then interpolate either numerically or graphically between these
points. The percentile points P10, P20, P30, ..., P90 may be calculated.
To achieve greater accuracy at the tails of the distribution it may be
desirable to calculate P1 and P5; also P95 and P99.


25.6 NORMAL TRANSFORMATIONS

The transformation of a variable to the normal form is a frequent procedure
in test standardization and correlational analysis. Not uncommonly, test
norms are normal transformations of the original raw scores with
arbitrarily selected means and standard deviations. A type of normal
transformation used by educationists is a T score. T scores are normally
distributed, usually with a mean of 50 and a standard deviation of 10. A normal
transformation with a mean of 100 and a standard deviation of 15, or
thereabouts, resembles an IQ scale.

Transforming a set of scores to the normal form is a relatively simple
procedure. Every percentile rank corresponds to a point on the base line of
the unit normal curve measured from a mean of zero in standard deviation
units. A percentile rank of 50 corresponds to the zero point. A rank of 60 is
.25 standard deviation units above the mean. A rank of 70 is .52 standard
deviation units above the mean. Table 25.3 shows points on the base line of
the unit normal curve corresponding to selected percentile ranks. These
and other points are readily obtained from any table of areas under the
normal curve (Table A of the Appendix).

In summary, the steps used in transforming a variable to the normal
form are as follows. Percentile ranks corresponding to certain points on
the score scale may be calculated. A table of areas under the normal curve
is used to find the points on the base line of the unit normal curve
corresponding to these percentile ranks. These points correspond to the
percentile points on the original score scale. Thus a correspondence is established
between a set of points on the original score scale and points on a normal
distribution of zero mean and unit standard deviation. Percentile ranks are
stepping-stones in establishing this correspondence. The normal standard
scores are multiplied by a constant to obtain any desired standard deviation
of the transformed values. A constant is usually added to produce a
change in means, thus eliminating negative signs. A transformed value
corresponding to any score value on the original scale may be obtained by
interpolation.


Table 25.3  Points on the base line of the unit normal curve
            corresponding to selected percentile ranks

  Percentile      Standard
     rank         deviation

      99            +2.33
      95            +1.65
      90            +1.28
      80            +0.84
      70            +0.52
      60            +0.25
      50             0.00
      40            -0.25
      30            -0.52
      20            -0.84
      10            -1.28
       5            -1.65
       1            -2.33
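
In place of a printed table of areas, the correspondence between percentile ranks and points on the base line of the unit normal curve can be obtained from the inverse normal function in Python's standard library. The sketch below is illustrative only; the function names `rank_to_z` and `transform` are hypothetical, and the default mean of 50 and standard deviation of 10 simply follow the T-score convention mentioned above.

```python
from statistics import NormalDist

UNIT_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def rank_to_z(percentile_rank):
    """Base-line point of the unit normal curve for a percentile rank.

    The rank must lie strictly between 0 and 100.
    """
    return UNIT_NORMAL.inv_cdf(percentile_rank / 100)


def transform(percentile_rank, mean=50.0, sd=10.0):
    """Normalized score with the chosen mean and standard deviation."""
    return mean + sd * rank_to_z(percentile_rank)
```

For example, `round(rank_to_z(60), 2)` and `round(rank_to_z(70), 2)` reproduce the .25 and .52 quoted in the text. Values computed this way can differ from a printed table in the last digit because of rounding.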

Some freedom of choice is possible in the selection of a set of points on
the score scale with associated percentile ranks. First, we may use the
exact top limits of the intervals and obtain the corresponding percentile
ranks from the cumulative-percentage frequencies. Second, we may take
the mid-points of the class intervals and obtain percentile ranks
corresponding to these. Third, we may use a selected set of percentile points
with associated percentile ranks. Thus P10, P20, ..., P90 may be
used; P1, P5, and P95, P99 may be added at the tails as a refinement. Fourth,
we may select certain equally spaced points on the normal standard-score
scale and ascertain their percentile ranks and the corresponding
percentile-point scores. These equally spaced points may, for example, be
-2.5, -2.0, -1.5, ..., +1.5, +2.0, +2.5. The difference between the
four alternatives outlined above is a matter of units. The first uses units of
class interval of the original variable, a unit extending from the top of one
interval to the top of the next. The second also uses units of class interval
of the original variable, a unit extending from the mid-point of one interval
to the mid-point of the next. The third, excluding the tails, uses equal units
on the percentile-rank scale. The fourth alternative uses equal units on the
normal standard-score scale. While minor advantages may be claimed for
one procedure in preference to another, the differences, where N is fairly
large, are trivial. Any one of the four procedures is satisfactory enough for
most practical purposes.

To illustrate the transformation of a set of scores to the normal form we
shall use the second alternative and take the mid-points of the class
intervals with their corresponding percentile ranks. Table 25.4 shows a
frequency distribution of test scores. Column 2 shows the exact mid-points
of the class intervals. Column 3 shows the frequencies. Column 4 shows
the cumulative frequencies to the mid-points. These are the cumulative
frequencies to the bottom of the interval plus half the frequencies within
the interval. The number of cases below the interval 60 to 64 is 22. The
number within the interval is 14. Half this number is 7. The cumulative
frequency to the mid-point is 22 + 7 = 29. Column 5 shows the cumulative
percentage frequencies to the mid-points. These cumulative percentage
frequencies are percentile ranks corresponding to the mid-points of the
intervals. The numbers in col. 6 are points on the base line of the unit
normal curve in standard deviation units from a zero mean. The percentage
of the area of the unit normal curve falling below a standard score of 2.81 is
99.75, the percentage below a standard score of 2.06 is 98.00, and so on.
These values are normalized standard scores corresponding to the
mid-points of the original score intervals. Thus we have a set of values on
the original scale paired with a set of values on a normal transformed scale.
Transformed values corresponding to any score on the original scale may
be obtained by either arithmetical or graphical interpolation.


Table 25.4  Illustration of the transformation of scores to a normal
            distribution: data of Table 25.2

                                Cumulative  Cumulative   Normal
                                frequency   percentage   standard      T score
   Class     Mid-               to mid-     to mid-      deviation   z × 10  z × 10
  interval   point   Frequency  point       point        unit z               + 50
     1         2         3         4           5            6          7       8

   95-99      97         1       199.5       99.75         2.81       28.1    78.1
   90-94      92         6       196.0       98.00         2.06       20.6    70.6
   85-89      87         8       189.0       94.50         1.60       16.0    66.0
   80-84      82        33       168.5       84.25         1.00       10.0    60.0
   75-79      77        40       132.0       66.00          .41        4.1    54.1
   70-74      72        50        87.0       43.50         -.16       -1.6    48.4
   65-69      67        26        49.0       24.50         -.69       -6.9    43.1
   60-64      62        14        29.0       14.50        -1.06      -10.6    39.4
   55-59      57        10        17.0        8.50        -1.37      -13.7    36.3
   50-54      52         6         9.0        4.50        -1.70      -17.0    33.0
   45-49      47         4         4.0        2.00        -2.05      -20.5    29.5
   40-44      42         2         1.0         .50        -2.58      -25.8    24.2
   Total               200
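
The mid-point procedure can be sketched in Python as below, using the frequencies of Table 25.2. This is an illustration rather than a prescribed routine: the names `FREQ` and `normalized_scores` are hypothetical, the inverse normal function from the standard library stands in for a table of areas, and the default mean of 50 and standard deviation of 10 are the usual T-score constants.

```python
from statistics import NormalDist

# (mid-point, frequency), from Table 25.2, lowest interval first
FREQ = [(42, 2), (47, 4), (52, 6), (57, 10), (62, 14), (67, 26),
        (72, 50), (77, 40), (82, 33), (87, 8), (92, 6), (97, 1)]


def normalized_scores(freq=FREQ, mean=50.0, sd=10.0):
    """Return rows of (mid-point, cum. freq., percentile rank, z, score)."""
    n = sum(f for _, f in freq)
    unit_normal = NormalDist()
    rows = []
    below = 0
    for mid, f in freq:
        cum_to_mid = below + f / 2          # cases below plus half of those within
        rank = 100 * cum_to_mid / n         # percentile rank of the mid-point
        z = unit_normal.inv_cdf(rank / 100)
        rows.append((mid, cum_to_mid, rank, z, mean + sd * z))
        below += f
    return rows
```

For the interval 60 to 64 (mid-point 62) the function gives a cumulative frequency of 29, a percentile rank of 14.5, and a z of about -1.06, in line with the worked example.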

Table 25.4 shows a T-score transformation. In col. 7 the standard scores
of col. 6 are multiplied by 10, thus yielding transformed scores with a
standard deviation of 10. In col. 8 a constant value 50 is added to the values of
col. 7, thus changing the origin from zero to 50 and eliminating negative
values. If we had multiplied by 15 and added 100, the transformed values
would have had a mean of 100 and a standard deviation of 15.


25.7 THE STANINE SCALE


During World War II the United States Army Air Force Aviation Psychology
Program used a stanine scale. Scores on psychological tests were con-


to an approximate normal form. The grouping,
ficiently refined for many practical purposes.


25.8 REGRESSION TRANSFORMATIONS


The data resulting from certain psychological experiments consist
of a set of initial measurements, obtained in the absence of an experimental
treatment, and a set of subsequent measurements obtained on the
same subjects in the presence of an experimental treatment. These latter
measurements are a function both of the initial measurements and the
effects of the experimental treatment. The investigator may wish to
transform the measurements obtained under the treatment to a new variable
which is independent of the initial measurements, the transformed variable
being the object of further analysis. To illustrate, measures of motor
performance may be obtained both in the absence and the presence of a stress
agent. The scores obtained under stress conditions are not independent of
the initial scores. A person may have a low score under stress because his
initial level of motor performance is low, or he may have a high score
because his initial level is high, quite apart from the effects of the stress
agent. We require a transformation that removes the effect of the initial
values. The variation in the transformed measurements is presumably the
result of the stress agent, the effects of initial level of performance being
removed.
Various approaches to this problem have been used. Some investigators
have employed difference scores, the presumption here being that the
increase or decrease in score over the initial value must result from the
experimental treatment. Other investigators have used ratio scores. These
methods do not achieve independence with respect to initial values. A
straightforward approach to this problem is to remove the effects of initial
values by simple linear regression, assuming of course that a
linear-regression model is appropriate to the data.

Let $X_0$ and $X_1$ be scores obtained under the two conditions. Let $z_0$ and $z_1$
be the corresponding standard scores. The regression equation for
predicting $z_1$ from $z_0$ is $z_1' = r_{01} z_0$, where $r_{01}$ is the correlation between measures
obtained under the two conditions, and $z_1'$ is a standard score predicted
from the initial values. The values $z_1'$ are points on the regression line used
in predicting $z_1$ from $z_0$. The difference between $z_1$ and $z_1'$ is a deviation from
the regression line and may be written as $z_1 - r_{01} z_0$. These deviations are
transformed values which are quite independent of the initial values. The
effect of initial performance level has been removed. The variation in the
transformed values results from the experimental condition plus error. Of
course, in any practical situation the data may be contaminated by other
factors unless adequate controls are exercised.

The scores $z_1 - r_{01} z_0$ are errors of estimation with zero mean and a
standard deviation given by $\sqrt{1 - r_{01}^2}$. They may be expressed in
standard-score form by writing

$$\delta = \frac{z_1 - r_{01} z_0}{\sqrt{1 - r_{01}^2}} \qquad [25.3]$$

In this form they may be referred to as $\delta$ scores, or delta scores. These
transformed scores have a mean of zero and a standard deviation of unity.
Their skewness and kurtosis are not a simple matter. Such scores may be
multiplied by a constant to obtain any desired standard deviation. Any
constant may be added to change the mean.

This type of simple regression transformation is quite general and is 
applicable in many situations where we wish to remove the effects of one 
variable on another. 
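
A minimal Python sketch of the delta-score transformation follows. Several things here are assumptions made for illustration: the function names are hypothetical, standard deviations are computed with N (not N - 1) in the denominator, and the six pairs of scores in the usage note are invented.

```python
from statistics import mean, pstdev


def standardize(xs):
    """Convert raw scores to standard scores (population SD)."""
    m, s = mean(xs), pstdev(xs)
    return [(x - m) / s for x in xs]


def correlation(xs, ys):
    """Product-moment correlation as the mean product of standard scores."""
    zx, zy = standardize(xs), standardize(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)


def delta_scores(initial, treated):
    """Standardized residuals of the treated scores regressed on the initial."""
    z0, z1 = standardize(initial), standardize(treated)
    r01 = correlation(initial, treated)
    denom = (1 - r01 ** 2) ** 0.5       # SD of the errors of estimation
    return [(b - r01 * a) / denom for a, b in zip(z0, z1)]
```

With invented data such as `initial = [26, 33, 41, 53, 28, 36]` and `treated = [18, 29, 52, 40, 25, 30]`, the resulting delta scores have a mean of zero, a standard deviation of unity, and a zero correlation with the initial scores, apart from rounding error.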


25.9 TRANSFORMATIONS WITH AGE ALLOWANCES


Any detailed consideration of a score transformation with age allowances is
beyond the scope of this book. A few comments may, however, be
appropriate. This transformation is a variant of the regression transformation
described in the previous section. Its purpose is to achieve comparability
between children of different ages by transforming to a variable which is
independent of chronological age. An older child A may have greater ability
than a younger child B. Relative to his age group, however, his ability may
be appreciably less. We require an answer to the question, how would child
A compare with child B if both were the same age? This question is
answered by a transformation to a variable which is independent of
chronological age. Age transformations usually incorporate a normalizing process.
The transformed scores are normally distributed with a fixed mean and
standard deviation.


months, 11 years 1 month, 11 years 2 months 
ing a wider age range, 3- or 6-month intervals may be used. The n 


Thus we fit a line to all the 5th-percentile points, another line to the
16th-percentile points, another line to the 50th-percentile points, and so on.
These lines describe the increase in score with increase in age at each
percentile-rank level. For a fairly narrow age range a sti


100 and a standard deviation of 15. All percentile points on the
50th-percentile line correspond to a score of 100 on the transformed variable. All
percentile points on the 84th-percentile line correspond to a score of 115 on
the transformed variable. All points on the 95th-percentile line correspond
to a score of 125 on the transformed variable. Points on the 5th- and
16th-percentile lines correspond to scores of 75 and 85 on the transformed
variable. Thus for each age group we have a set of percentile points, points
on a fitted line, and a corresponding set of transformed values. By
interpolation and extrapolation a transformed value corresponding to each original
score value may be obtained and a conversion table prepared.

Transformed scores obtained by this general method will be
approximately normal with a mean of 100 and a standard deviation of 15. Any other
appropriate mean and standard deviation may be used. The transformed
scores are independent of age. The correlation between chronological age
and transformed score is about zero.

Many variants and refinements of this general method may be applied.
Many investigators may prefer to use a large number of percentile lines
and equal standard-score units.


EXERCISES


1 State the difference between percentile points and percentile ranks.


2 For the data of Table 25.1, compute (a) percentile points P25, Ps, and 
Р» and (b) percentile ranks for scores 103, 123, and 136. 


3 For the data of Table 25.2, compute (a) percentile points P19, Р, and
P90 and (b) percentile ranks for the scores 59, 74, and 82.


4 Develop a T-score transformation for the data of Table 3.1.


5 Develop a stanine transformation for the data of Table 3.1.


6 The following are measures of motor skill under initial nonstress
conditions and subsequent stress conditions for a sample of 12 individuals.


Nonstress 26 33 41 53 28 36° 44 28 52 47 59 ЭЛЕ 
Stress 18 29 52 40 25 30 38 5 4] 3 50 45. 


Apply a regression transformation to these data. What purpose would 
such a transformation serve? 


26


PARTIAL AND MULTIPLE
CORRELATION


26.1 INTRODUCTION 


two variables are gathered and forms of multivariate analysis are required. 
Two forms of correlational analysis which may be applied to multivariate 


26.2 PARTIAL CORRELATION 


What is meant by eliminating, or removing, the effect of a third variable?
These terms in the present context have a precise statistical meaning. Let
$X_1$, $X_2$, and $X_3$ be three variables. All or part of the correlation between $X_1$
and $X_2$ may result because both are correlated with $X_3$. The reader will
recall from previous discussion on correlation that a score on $X_1$ may be
divided into two parts. One part is a score predicted from $X_3$. The other
part is the residual, or error of estimate, in predicting $X_1$ from $X_3$. These
two parts are independent, or uncorrelated. Similarly, a score on $X_2$ may be
divided into two parts, a part predictable from $X_3$ and a residual, or error of
estimate, in predicting $X_2$ from $X_3$. The correlation between the two sets of
residuals, or errors of estimate, in predicting $X_1$ from $X_3$ and $X_2$ from $X_3$ is
the partial correlation coefficient. It is the part of the correlation which
remains when the effect of the third variable is eliminated, or removed.

The formula for calculating the partial correlation coefficient to
eliminate a third variable is

$$r_{12.3} = \frac{r_{12} - r_{13} r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}} \qquad [26.1]$$

The notation $r_{12.3}$ means the correlation between residuals when $X_3$ has
been removed from both $X_1$ and $X_2$. This is sometimes called a first-order
partial correlation coefficient.

Let $X_1$ and $X_2$ be scores on an intelligence and a psychomotor test for a
group of school children. Let $X_3$ be age. Let the correlation between the
three variables be as follows: $r_{12} = .55$, $r_{13} = .60$, and $r_{23} = .50$. The partial
correlation coefficient is

$$r_{12.3} = \frac{.55 - .60 \times .50}{\sqrt{(1 - .60^2)(1 - .50^2)}} = .36$$

Using a variance interpretation, the proportion overlap between $X_1$ and $X_2$
is $r_{12}^2 = .55^2 = .303$. The proportion overlap with $X_3$ eliminated is $r_{12.3}^2 =
.36^2 = .130$. The proportion overlap which results from the effects of age is
.303 - .130 = .173. It would also be appropriate to state that the
percentage of the total association present resulting from the effect of age is
(.173/.303)100 = 57 per cent. The remaining 43 per cent of the association
results from other factors.
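
The first-order partial correlation is easy to verify numerically. The Python below is a small sketch with a hypothetical function name, `partial_r`, applied to the three correlations of the example.

```python
from math import sqrt


def partial_r(r12, r13, r23):
    """First-order partial correlation of 1 and 2 with 3 removed."""
    return (r12 - r13 * r23) / sqrt((1 - r13 ** 2) * (1 - r23 ** 2))


r = partial_r(0.55, 0.60, 0.50)
overlap_total = 0.55 ** 2           # proportion overlap between X1 and X2
overlap_partial = r ** 2            # proportion overlap with X3 eliminated
due_to_age = overlap_total - overlap_partial
```

Here `r` rounds to .36. Because the sketch carries the unrounded value of `r` forward, the variance proportions derived from it may differ in the third decimal place from figures worked by hand with rounded values.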
Partial correlation may be used to remove the effect of more than one
variable. The partial correlation between $X_1$ and $X_2$ with the effects of both
$X_3$ and $X_4$ removed is

$$r_{12.34} = \frac{r_{12.3} - r_{14.3} r_{24.3}}{\sqrt{(1 - r_{14.3}^2)(1 - r_{24.3}^2)}} \qquad [26.2]$$

This is a second-order partial correlation coefficient. Because of difficulties
of interpretation, partial correlation coefficients involving the elimination
of more than one variable are infrequently calculated.
A t test may be used to test whether a partial correlation coefficient is
significantly different from zero. The required t is

$$t = \frac{r_{12.3}}{\sqrt{(1 - r_{12.3}^2)/(N - 3)}} \qquad [26.3]$$

This may be referred to a table of t with N - 3 degrees of freedom.
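
As an illustration of the t test, the Python sketch below evaluates the statistic for a first-order partial correlation. The function name `t_partial` is hypothetical, and the sample size of 100 used in the usage note is an assumed figure, since the text gives no N for the school-children example.

```python
from math import sqrt


def t_partial(r_partial, n):
    """t for a first-order partial correlation; returns (t, degrees of freedom)."""
    df = n - 3
    t = r_partial / sqrt((1 - r_partial ** 2) / df)
    return t, df
```

With `t_partial(0.36, 100)` the statistic is about 3.80 on 97 degrees of freedom; the value would then be compared with a table of t in the usual way.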


26.3 MULTIPLE REGRESSION AND CORRELATION


The correlation coefficient may be used to predict or estimate a score on an
unknown variable from knowledge of a score on a known variable. The
regression equation in standard-score form is $z_1' = r_{12} z_2$, where $z_1'$ is a
predicted or estimated standard score. In this situation we have one
dependent and one independent variable. If $z_2 = 1.2$ and $r_{12} = .80$, the best
estimate of an individual's standard score on variable 1 is $z_1' = .80 \times 1.2 =
.96$. The estimate is that the individual is .96 standard deviation units
above the average.


Variable 1 is the criterion, and variables 2 and 3 are the predictors. Note 
е main diagonal. In estimating stan- 
тез on 2 and 3 separately, the two 
21 =.3z5. Variable 2 is a much better 


‚ by employing a knowledge of both 2 
and 3, a better estimate of the criterion may be obtained. 


the correlation between a standard score on 1 with the sum of standard
scores on 2 and 3 is given by $C/\sqrt{AB}$. In our example this becomes

$$\frac{C}{\sqrt{AB}} = \frac{.8 + .3}{\sqrt{1.0 + 1.0 + .5 + .5}} = \frac{1.1}{\sqrt{3}} = .635$$

If we express variables 2 and 3 in standard measure, add them together,
and correlate the sum with standard scores on the criterion, the correlation
will be .635. This is not as good as the prediction obtained with variable 2
taken alone. The straight sum of standard scores assigns equal weight to
the two variables. When variables are added together directly, they are
weighted in a manner proportional to their standard deviations. The
standard deviation of standard scores is 1. Consequently, on adding together
standard scores, the variables are equally weighted.

Let us select some arbitrary set of weights and observe the result. Let us
assign weights of 4 and 1 to the two predictors. Thus one predictor will
receive four times the weight of the other. Write these weights along the top
and to the side of the correlation table as follows:


The correlation of the criterion with the sum $4z_2 + z_3$ is again given by
$C/\sqrt{AB}$ and is

$$\frac{3.2 + .3}{\sqrt{16.0 + 1.0 + 2.0 + 2.0}} = \frac{3.5}{\sqrt{21}} = .764$$

This particular arrangement of weights, 4 and 1, results in a correlation
which is substantially better than that obtained with equal weights. Ob-


taken separately.
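
The effect of different weights can be explored with a short Python sketch. The reading of the quadrant sums used here is an assumption stated for the illustration: C is taken as the weighted sum of the predictor-criterion correlations, B as the sum of the weighted predictor intercorrelation table, and A as 1 because the criterion is in standard-score form. The function name is hypothetical; the correlations .8, .3 and .5 are those of the example.

```python
from math import sqrt


def corr_with_weighted_sum(r_criterion, r_predictors, weights):
    """Correlation of a standardized criterion with a weighted sum of predictors.

    r_criterion  : correlations of each predictor with the criterion
    r_predictors : square table of predictor intercorrelations (1s on diagonal)
    weights      : weight applied to each predictor's standard score
    """
    k = len(weights)
    c = sum(w * r for w, r in zip(weights, r_criterion))
    b = sum(weights[i] * weights[j] * r_predictors[i][j]
            for i in range(k) for j in range(k))
    a = 1.0                          # variance of the standardized criterion
    return c / sqrt(a * b)


R_CRIT = [0.8, 0.3]
R_PRED = [[1.0, 0.5],
          [0.5, 1.0]]
equal = corr_with_weighted_sum(R_CRIT, R_PRED, [1, 1])     # 1.1 / sqrt(3)
four_one = corr_with_weighted_sum(R_CRIT, R_PRED, [4, 1])  # 3.5 / sqrt(21)
```

Trying other weight pairs in the same way shows how the correlation changes as the relative weight of the two predictors is varied.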


How may a set of weights be obtained which will maximize the
correlation between the criterion and the sum of scores on the independent
variables? Let us represent weights by the symbols $\beta_2$ and $\beta_3$. The estimated
standard score on 1 is then given by $z_1' = \beta_2 z_2 + \beta_3 z_3$. We require
weights $\beta_2$ and $\beta_3$ such that the correlation between $z_1$ and $z_1'$ is a maximum.
Mathematically, the problem reduces to finding the weights which
will minimize the average sum of squares of differences between the
criterion score $z_1$ and the estimated criterion score $z_1'$. We require $\beta_2$
and $\beta_3$ such that

$$\frac{1}{N}\sum (z_1 - z_1')^2 = \text{a minimum}$$

The values of $\beta_2$ and $\beta_3$ are multiple regression weights for standard
scores. They are sometimes called beta coefficients.

With three variables the values of $\beta_2$ and $\beta_3$ are given by

$$\beta_2 = \frac{r_{12} - r_{13} r_{23}}{1 - r_{23}^2} \qquad [26.4]$$

$$\beta_3 = \frac{r_{13} - r_{12} r_{23}}{1 - r_{23}^2} \qquad [26.5]$$

In the above example

$$\beta_2 = \frac{.8 - .3 \times .5}{1 - .5^2} = .867$$

$$\beta_3 = \frac{.3 - .8 \times .5}{1 - .5^2} = -.133$$

Let us write these weights above and to the side of the correlation table and
multiply the rows and columns as follows:

.867   -.133


The correlation between the criterion and the weighted sum is

$$C/\sqrt{AB} = .654/\sqrt{.654} = \sqrt{.654} = .809$$

This is a multiple correlation coefficient and may be denoted by R. No
other system of weights will yield a higher correlation between the criterion
and the weighted sum of predictors.

Note that the sum of elements in the top right quadrant of the weighted
correlation table is equal to the sum in the lower right, or C = B. This
circumstance will occur if the weights used are multiple regression weights. It
provides a check on the calculation. We note also that $R^2 = C$ and $R =
\sqrt{C}$. Thus the multiple correlation coefficient may be obtained by the
formula

$$R = \sqrt{\beta_2 r_{12} + \beta_3 r_{13}} \qquad [26.6]$$

This is the commonly used formula for calculating a multiple correlation
coefficient.

In our example the multiple correlation is .809. The correlation of
variable 2 with the criterion is .8. The addition of the third variable increases
prediction very slightly. In a practical situation the third variable could
safely be discarded as contributing a negligible amount to the efficacy of
prediction.
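
The three-variable formulas for the beta coefficients and for R can be checked with the following Python sketch. The function names are hypothetical; the correlations are those of the example.

```python
from math import sqrt


def beta_weights(r12, r13, r23):
    """Standard-score regression weights for two predictors."""
    denom = 1 - r23 ** 2
    beta2 = (r12 - r13 * r23) / denom
    beta3 = (r13 - r12 * r23) / denom
    return beta2, beta3


def multiple_r(r12, r13, r23):
    """Multiple correlation of the criterion with predictors 2 and 3."""
    beta2, beta3 = beta_weights(r12, r13, r23)
    return sqrt(beta2 * r12 + beta3 * r13)


b2, b3 = beta_weights(0.8, 0.3, 0.5)
R = multiple_r(0.8, 0.3, 0.5)
```

The weights round to .867 and -.133. Carrying unrounded weights through gives an R of about .808, a shade below the .809 obtained above from the rounded sum .654.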


26.4 THE REGRESSION EQUATION FOR RAW SCORES


The equation $z_1' = \beta_2 z_2 + \beta_3 z_3$ is a regression equation in standard-score
form. It will yield the best possible linear prediction of a standard score on
1 from standard scores on 2 and 3. In practice, we usually require a
regression equation for predicting a raw score on 1 from a raw score on 2 and 3.
Let $X_1'$ be a predicted raw score on 1, and $X_2$ and $X_3$ the obtained raw scores
on 2 and 3. The estimated standard score $z_1'$ and the observed standard
scores $z_2$ and $z_3$ may be written as

$$z_1' = \frac{X_1' - \bar{X}_1}{s_1} \qquad z_2 = \frac{X_2 - \bar{X}_2}{s_2} \qquad z_3 = \frac{X_3 - \bar{X}_3}{s_3}$$

By substituting these values in the regression equation in standard-score
form we obtain

$$\frac{X_1' - \bar{X}_1}{s_1} = \beta_2 \frac{X_2 - \bar{X}_2}{s_2} + \beta_3 \frac{X_3 - \bar{X}_3}{s_3}$$

Rearranging terms and writing the expression explicit for $X_1'$ yields

$$X_1' = \beta_2 \frac{s_1}{s_2} X_2 + \beta_3 \frac{s_1}{s_3} X_3 + \left(\bar{X}_1 - \beta_2 \frac{s_1}{s_2} \bar{X}_2 - \beta_3 \frac{s_1}{s_3} \bar{X}_3\right) \qquad [26.7]$$

This is a regression equation in raw-score form. It may be used to predict a
raw score on 1 from a raw score on 2 and 3. The values $\beta_2 s_1/s_2$ and $\beta_3 s_1/s_3$
act as weights. The quantity to the right in parentheses is a constant.

In the example of the previous section $\beta_2 = .867$ and $\beta_3 = -.133$. Let us
assume that $s_1 = 5$, $s_2 = 10$, $s_3 = 20$; also $\bar{X}_1 = 20$, $\bar{X}_2 = 40$, and $\bar{X}_3 = 60$.
The regression equation in raw-score form is written as

$$X_1' = (.867)\tfrac{5}{10} X_2 + (-.133)\tfrac{5}{20} X_3 + \left[20 - (.867)\tfrac{5}{10}(40) - (-.133)\tfrac{5}{20}(60)\right]$$

$$= .434 X_2 - .033 X_3 + 4.62$$

26.5 THE GEOMETRY OF MULTIPLE REGRESSION


Given two variables $X_1$ and $X_2$, each pair of observations may be plotted as
a point on a plane. If interest resides in predicting one variable from a
knowledge of another, a straight regression line may be fitted to the points
and this line used for prediction purposes.

Given three variables $X_1$, $X_2$, and $X_3$, each triplet of observations may be
plotted as a point in a space of three dimensions as shown in Fig. 26.1.
Instead of two axes at right angles to each other, we now have three. All


Fig. 26.1  Geometrical representation of multiple regression.
           ABCD is a multiple regression plane.


by ABCD. With two variables the regression equation is the equation for a
straight line and is of the type $X_1' = b_2 X_2 + a$, where $b_2$ is the slope of the
line and a is the point where the line intercepts the $X_1$ axis. With three
variables the regression equation is the equation for a plane and is of the type
$X_1' = b_2 X_2 + b_3 X_3 + a$. Here $b_2$ is the slope of the line AD in Fig. 26.1 and $b_3$
is the slope of the line AB. The constant a is the point where the plane
intercepts the $X_1$ axis. In Fig. 26.1 it is the distance AO.

Consider now a particular individual. Represent his score on X; by OE - 
and on X; by OF. We locate the point С in the plane of Х and X; and 
proceed upward until we reach the point Н in the regression plane ABCD. 
The distance GH is the best estimate of the individual's score on X, given 
his scores on X; and Хз. It is the best estimate in the sense that the regres- 
sion plane is so located as to minimize the sums of squares of deviations 
from it parallel to the X, axis. 

The reader wil! observe that the three-variable case is a simple extension 
of the two-variable case. A plane is used instead of a straight line. With 
four or more variables the idea is essentially the same. With four variables, 
in effect, we plot points in a space of four dimensions and fit a three-dimen- 
sional hyperplane to these points. By increasing the number of variables 
we may complicate the arithmetic. We do not complicate the idea. 
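The fitting of a regression plane can be illustrated with a short least-squares sketch in Python. The scores below are hypothetical and are constructed to lie exactly on a known plane, so the fit should recover that plane; NumPy's `lstsq` is used here as a convenient solver and is not the hand method the text describes. Adding further predictor columns to the design matrix extends the same idea to hyperplanes.

```python
import numpy as np

# Hypothetical raw scores that lie exactly on the plane
# X1 = 0.5*X2 + 0.25*X3 + 4 (illustrative values only).
x2 = np.array([30.0, 35.0, 40.0, 45.0, 50.0, 42.0])
x3 = np.array([55.0, 70.0, 60.0, 50.0, 65.0, 58.0])
x1 = 0.5 * x2 + 0.25 * x3 + 4.0

# Design matrix: one column per predictor plus a column of ones
# for the intercept a.
design = np.column_stack([x2, x3, np.ones_like(x2)])

# Least squares minimizes the sum of squared deviations parallel
# to the X1 axis, which is the criterion described in the text.
coef, *_ = np.linalg.lstsq(design, x1, rcond=None)
b2, b3, a = coef
print(b2, b3, a)
```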


26.6 MORE THAN THREE VARIABLES


In the discussion above we have considered the multiple regression case
with three variables only, one criterion and two predictors. With $k$
variables the multiple regression equation in standard-score form is

[26.8]  $z_1' = \beta_2 z_2 + \beta_3 z_3 + \cdots + \beta_k z_k$

The raw-score form of this equation may be obtained, as previously, by
substituting for the values of $z_i$ the values $(X_i - \bar{X}_i)/s_i$ and
rearranging terms. We thereby obtain

[26.9]  $X_1' = \beta_2\frac{s_1}{s_2}X_2 + \beta_3\frac{s_1}{s_3}X_3 + \cdots + \beta_k\frac{s_1}{s_k}X_k + A$

where $A$ is given by

[26.10]  $A = \bar{X}_1 - \beta_2\frac{s_1}{s_2}\bar{X}_2 - \beta_3\frac{s_1}{s_3}\bar{X}_3 - \cdots - \beta_k\frac{s_1}{s_k}\bar{X}_k$

The multiple correlation coefficient is given by

[26.11]  $R = \sqrt{\beta_2 r_{12} + \beta_3 r_{13} + \cdots + \beta_k r_{1k}}$


Thus to calculate this coefficient we multiply each correlation of a
predictor with the criterion by its corresponding regression coefficient,
sum these products, and take the square root.

A number of computational procedures exist for calculating the required
regression weights with more than three variables. The method described
here originates with Aitken (1937) and has been called the method of
pivotal condensation. A modification of the Aitken method which reduces the
amount of calculation has been developed by DuBois (1965).


26.7 AITKEN’S NUMERICAL SOLUTION


To illustrate the application of Aitken’s method let us consider a problem
with five variables, one criterion and four predictors. Denote the criterion
by $X_1$ and the predictors by $X_2$, $X_3$, $X_4$, and $X_5$. The criterion
may be regarded as a measure of success in an occupation, and the predictors
may be psychological tests used to predict performance in the occupation.

The intercorrelations between the five variables are shown in Table 26.1.
The means and standard deviations of the five variables are shown in Table
26.2. Table 26.3 shows the procedure for calculating the regression
coefficients. For a 2 × 2 array of cell values

    a  b
    c  d

the difference between cross products is ad - cb. In this case the cell
value a is the pivotal element.


Table 26.1 Correlation coefficients between a criterion and four predictors


         $X_1$   $X_2$   $X_3$   $X_4$   $X_5$
$X_1$    1.00     .72     .58     .41     .63
$X_2$     .72    1.00     .69     .49     .39
$X_3$     .58     .69    1.00     .38     .19
$X_4$     .41     .49     .38    1.00     .27
$X_5$     .63     .39     .19     .27    1.00


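Table 26.2 Means and standard deviations for criterion and four predictors


                      $X_1$    $X_2$    $X_3$    $X_4$    $X_5$
Mean                   8.72   104.65    43.22    14.98    87.22
Standard deviation     5.68    15.71     9.92     6.32    14.09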


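Table 26.3 Aitken’s method for computing regression coefficients*


Slab  Reciprocal   Rows (a dot represents a zero)                           Check

A                  (1)     .69    .49    .39   -1      .      .      .      1.57
                   .69    1       .38    .19    .     -1      .      .      1.26
                   .49    .38    1       .27    .      .     -1      .      1.14
                   .39    .19    .27    1       .      .      .     -1       .85
                   .72    .58    .41    .63     .      .      .      .      2.34

B     (1.908)     (.524)   .042  -.079   .690  -1      .      .              .177
                  1.000    .080  -.151  1.317  -1.908  .      .              .338
                   .042    .760   .079   .490   .     -1      .              .371
                  -.079    .079   .848   .390   .      .     -1              .238
                   .083    .057   .349   .720   .      .      .             1.210

C     (1.321)     (.757)   .085   .435   .080  -1      .                     .357
                  1.000    .112   .575   .106  -1.321  .                     .472
                   .085    .836   .494  -.151   .     -1                     .265
                   .050    .362   .611   .158   .      .                    1.182

D     (1.211)     (.826)   .445  -.160   .112  -1                            .225
                  1.000    .539  -.194   .136  -1.211                        .272
                   .356    .582   .153   .066   .                           1.158

E     Regression coefficients     .390   .222   .018   .431                 1.061


* Example from Godfrey H. Thomson, The factorial analysis of human ability,
5th ed., University of London Press Ltd., London, 1951.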


The steps in the calculation are as follows:


1 Write down the matrix of intercorrelations between the predictors, that
  is, between variables $X_2$, $X_3$, $X_4$, and $X_5$. Insert 1’s along the
  diagonal. Beneath this matrix write a row containing the correlations of
  the predictors with the criterion. The resulting matrix is shown to the
  left of slab A in Table 26.3.

2 To the right of the above matrix record another matrix with -1’s down
  the diagonal. All other elements are zero, including those in the bottom
  row. In Table 26.3 a dot represents a zero.

3 Sum the rows to obtain the values in the check column.

4 Using the first and second rows of slab A, with the 1 in the top left
  cell as the pivotal element, the following product differences are formed:

      1 × 1 - .69 × .69 = .524        1 × (-1) - 0 × .69 = -1
      1 × .38 - .49 × .69 = .042      1 × 0 - 0 × .69 = 0
      1 × .19 - .39 × .69 = -.079     1 × 0 - 0 × .69 = 0
      1 × 0 - (-1) × .69 = .690

  These values are recorded in the first row of slab B. The check value is
  obtained by forming the product difference 1 × 1.26 - 1.57 × .69 = .177.
  If the calculation is correct to this point, the sum of elements in
  the first row of slab B will equal the product difference .177.

5 Beneath the first row of slab B write a second version of it obtained by
  dividing each element by the top left element, .524. The result is a row
  with unity as the pivot. This assists subsequent calculation. This part
  of the procedure is most readily accomplished by multiplying the elements
  in the row by the reciprocal of .524, or by 1.908.

6 The remaining elements in slab B are obtained by forming product
  differences using the first row of slab A with the third, fourth, and
  fifth rows of slab A, successively, always using the 1 in the top left
  cell as the pivotal element. Thus

      1 × .38 - .69 × .49 = .042
      1 × 1 - .49 × .49 = .760

  and so on. Each row is summed to provide a check on the calculation.
  The result is a reduction of the original 5 × 4 matrix of slab A to the
  4 × 3 matrix of slab B.

7 The procedure is now repeated to obtain slabs C, D, and E. At each
  stage, with the exception of the last, the top row in each slab is divided
  by the left-hand cell value, or multiplied by the reciprocal of that value,
  to obtain a second version of the top row. The appropriate reciprocal
  for row C is 1.321, and for row D it is 1.211.

8 By proceeding with the calculation, the original matrix is condensed to
  the cell values in slab E. These four values are the multiple regression
  coefficients for predicting a standard score on the criterion from
  standard scores on the four predictors.
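The result of the steps above can be cross-checked by solving the same set of linear equations directly. The sketch below is not Aitken's pivotal condensation; it swaps in NumPy's general linear solver, which is assumed here to be an acceptable substitute for checking purposes. The correlations are those of Table 26.1, and the variable names are illustrative.

```python
import numpy as np

# Intercorrelations between the four predictors X2..X5 (Table 26.1).
R_pp = np.array([
    [1.00, 0.69, 0.49, 0.39],
    [0.69, 1.00, 0.38, 0.19],
    [0.49, 0.38, 1.00, 0.27],
    [0.39, 0.19, 0.27, 1.00],
])
# Correlations of the predictors with the criterion X1.
r_pc = np.array([0.72, 0.58, 0.41, 0.63])

# The standard-score regression weights satisfy R_pp @ betas = r_pc.
betas = np.linalg.solve(R_pp, r_pc)

# Multiple correlation: square root of the sum of beta_j * r_1j.
multiple_R = float(np.sqrt(betas @ r_pc))

print(np.round(betas, 3), round(multiple_R, 2))
```

The weights obtained this way should agree with the slab E values to within the rounding carried in the hand calculation.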


In this example the regression equation for predicting the criterion from
the predictors in standard-score form is
$z_1' = .390z_2 + .222z_3 + .018z_4 + .431z_5$. No other system of weights
will provide a better estimate of the criterion. The correlations of the four
predictors with the criterion are

    .72    .58    .41    .63

By multiplying these by the corresponding regression coefficients, summing
the resulting products, and taking the square root, we obtain the multiple
correlation coefficient as follows:

$R = \sqrt{.390 \times .72 + .222 \times .58 + .018 \times .41 + .431 \times .63} = .83$


A multiple correlation coefficient is amenable to the same general type of
interpretation as any other correlation coefficient. It is the correlation
between a criterion variable and the weighted sum of the predictors, the
predictors being weighted in order to maximize that correlation.

To obtain a multiple regression equation in raw-score form we require
the means and standard deviations of Table 26.2. We may write


$X_1' = (.390)\frac{5.68}{15.71}X_2 + (.222)\frac{5.68}{9.92}X_3 + (.018)\frac{5.68}{6.32}X_4 + (.431)\frac{5.68}{14.09}X_5 + A$

The constant $A$ is given by

$A = 8.72 - (.390)\frac{5.68}{15.71}(104.65) - (.222)\frac{5.68}{9.92}(43.22) - (.018)\frac{5.68}{6.32}(14.98) - (.431)\frac{5.68}{14.09}(87.22) = -26.81$


With any substantial number of variables the calculation of multiple
regression weights is clearly a laborious procedure and requires the use of
modern computing devices.


26.8 THE SIGNIFICANCE OF A MULTIPLE CORRELATION COEFFICIENT

An F ratio may be used to test whether an observed multiple correlation
coefficient is significantly different from zero. The required value of F is
given by the formula

[26.12]  $F = \frac{R^2}{1 - R^2} \cdot \frac{N - k - 1}{k}$

where R = multiple correlation coefficient
      N = number of observations
      k = number of independent variables or predictors

The table of F is entered with $df_1 = k$ and $df_2 = N - k - 1$.
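As a small illustration, the F ratio and its degrees of freedom can be computed as follows. The function name `multiple_r_f_ratio` is an assumption made for this sketch, and the sample size in the usage line is hypothetical; the resulting F still has to be compared with a tabled critical value.

```python
def multiple_r_f_ratio(R, n_obs, k):
    """Return (F, df1, df2) for testing a multiple correlation R.

    F = [R^2 / (1 - R^2)] * [(N - k - 1) / k], with df1 = k and
    df2 = N - k - 1.
    """
    df1, df2 = k, n_obs - k - 1
    if df2 <= 0:
        raise ValueError("need more observations than k + 1")
    r_squared = R * R
    return (r_squared / (1.0 - r_squared)) * (df2 / df1), df1, df2


# R = .83 with four predictors; N = 50 is a hypothetical sample size.
F, df1, df2 = multiple_r_f_ratio(0.83, n_obs=50, k=4)
print(F, df1, df2)
```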


26.9 SHRINKAGE IN MULTIPLE CORRELATION 


The multiple correlation coefficient is a measure of the efficacy of
prediction for a particular sample. It is not, however, an unbiased estimate
of the population correlation coefficient. Also, if the multiple regression
weights calculated on one sample are applied to a second sample, the
correlation between the weighted predictors and the criterion in the second
sample will be less than the multiple correlation originally calculated on
the first sample. The bias in the multiple correlation results because the
process of determining regression weights, which minimize the average
squared error, takes advantage, as it were, of the idiosyncrasies of the
sample. The extent of the bias depends on the population values of the
multiple correlation coefficient, the sample size, and the number of
predictor variables. The phenomenon under discussion here is commonly
referred to as shrinkage. One estimate of the population value of the squared
multiple correlation coefficient is given by the formula

[26.13] 


€ extent of the bias in R?, аз an es- 
decreases with increase in 


that this formula Provides quite а 
lation multiple correlation, 


known as the Cross-validity and is denoted by re. The quantity r, is gener- 


Population multiple correlation. The bias 
Size and increase with the number of pre- 


26.10 SOME OBSERVATIONS ON MULTIPLE


shown equal to $R^2 = \beta_2^2 + \beta_3^2 + 2\beta_2\beta_3r_{23}$,
comprised of three additive components: $\beta_2^2$ is a contribution by
$X_2$, $\beta_3^2$ a contribution by $X_3$, and the term
$2\beta_2\beta_3r_{23}$ is a component which involves the correlation between
$X_2$ and $X_3$. The assessment of the relative contributions of the different
variables is not a simple matter of the comparison of the relative magnitudes
of the regression coefficients but requires also a consideration of the
correlation terms.
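The three-part decomposition for two predictors can be checked numerically. In the sketch below the correlations are hypothetical, and the two-predictor formulas for the standard-score weights are the usual ones, assumed here rather than taken from this passage.

```python
def two_predictor_betas(r12, r13, r23):
    """Standard-score regression weights for two predictors X2 and X3."""
    denom = 1.0 - r23 ** 2
    beta2 = (r12 - r13 * r23) / denom
    beta3 = (r13 - r12 * r23) / denom
    return beta2, beta3


# Hypothetical correlations, chosen only for illustration.
r12, r13, r23 = 0.60, 0.50, 0.40
b2, b3 = two_predictor_betas(r12, r13, r23)

# Squared multiple correlation as the sum of beta_j * r_1j.
r_squared = b2 * r12 + b3 * r13

# The same quantity split into its three additive components.
parts = {
    "X2 alone": b2 ** 2,
    "X3 alone": b3 ** 2,
    "joint term": 2 * b2 * b3 * r23,
}
print(r_squared, parts)
```

With correlated predictors the joint term is not zero, which is why the squared weights alone do not describe each variable's contribution.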

Frequently, in practical work, the greater part of the prediction achieved
can be attributed to a relatively small number of variables, perhaps four or
five or six, and the inclusion of additional variables contributes only small
and diminishing amounts to prediction. A test of significance may be
applied to decide whether or not the addition of one or more variables to a
subset of variables will significantly improve prediction.

Investigators concerned with problems of prediction frequently attempt
to identify independent variables which show a high correlation with the
criterion and a low correlation with each other. If two variables have a
fairly high correlation with the criterion and a low correlation with each
other, both measure different aspects of the criterion and both will
contribute substantially to prediction. If two variables have a high
correlation with each other, they are measures of much the same thing, and
the inclusion of both, instead of either one or the other, will contribute
little to the prediction achieved.


EXERCISES


1 Given the correlations $r_{12} = .70$, $r_{13} = .50$, and $r_{23} = .60$,
  compute $r_{12.3}$. What percentage of the association between variables 1
  and 2 results because of the effect of variable 3?


2 The mean and standard deviation of a criterion variable are
  $\bar{X}_1 = 24.56$ and $s_1 = 4.52$. The means and standard deviations for
  two predictor variables are $\bar{X}_2 = 36.48$, $\bar{X}_3 = 16.95$, and
  $s_2 = 5.49$, $s_3 = 3.66$. The correlations are $r_{12} = .70$,
  $r_{13} = .65$, and $r_{23} = .33$. Compute (a) the correlation between
  standard scores on the criterion and the sum of standard scores on the two
  predictors, (b) the correlation between raw scores on the criterion and the
  sum of raw scores on the two predictors, (c) the multiple regression
  equation in standard-score form, (d) the multiple regression equation in
  raw-score form, (e) the multiple correlation coefficient.


3 The following are intercorrelations between first-year university
  averages and five university entrance examinations. Means and standard
  deviations are also given:


x, .09 49 ET .50 .20 1.00 71.80 4.45 


  Compute (a) the multiple regression equation in standard-score form,
  (b) the multiple regression equation in raw-score form, (c) the multiple
  correlation coefficient, (d) the multiple correlation coefficients
  obtained by successively dropping variables 6, 5, and 4.


AN INTRODUCTION TO FACTOR ANALYSIS


27.1 INTRODUCTION


Factor analysis is a multivariate statistical method which is used in the 
analysis of tables, or matrices, of correlation coefficients. These coeffi- 
cients are usually, although not necessarily, product-moment correlation 
coefficients. In most applications of factor analysis the variables are psy- 
chological tests. The method is, however, quite general and can be applied 
to correlations between variables of any type, e.g., economic, physiological, 
meteorological, or physical. Direct inspection of any large matrix of corre- 
lation coefficients indicates immediately that no simple intuitive interpreta- 
tion of the pattern of interrelations between the variables is possible. The 
method of factor analysis reduces the original set of variables to a smaller 
number of variables, called factors, which are amenable to interpretation. 
The information which a complex pattern of interrelations contains can,
thereby, be understood. In multiple regression a distinction is made between
a dependent variable, or variables, and a set of independent variables.
Factor analysis is usually applied to data where no distinction between
dependent and independent variables is possible. Concern is with
the study of interdependencies and the discovery of structure among
interdependencies.

Factor analysis as a method originated with C. Spearman. In a paper on 
the theory of intelligence, published in 1904, Spearman argued that “all 
branches of intellectual activity have in common one fundamental func- 
tion (or group of functions), whereas the remaining or specific elements of 
the activity seem in every case to be wholly different from that in all 
others." Spearman analyzed tables of intercorrelation between psychological
tests and purportedly was able to show that the intercorrelations
could be accounted for in terms of one general factor, common to all tests,
and factors which were specific, or unique to each test. This was known as
the theory of two factors. The general factor was referred to as g. It is of
interest to note that Spearman was primarily concerned with psychological
theory and viewed the factor analysis of tables of intercorrelations as 
providing evidence to support his theory. Spearman’s theory had neuro- 
physiological aspects. He linked his general factor with energy, which was 
thought to serve the whole cortex or nervous system, and his specific fac- 
tors with particular groups of neurons, which served particular kinds of 
operations. 

Following Spearman, much work on factor analysis was done by Cyril 
Burt, Godfrey H. Thomson, Karl J. Holzinger, L. L. Thurstone, and others. 
The theory of two factors was the object of much criticism and was rapidly
discarded, although the concept of a general factor common to many types
of intellectual activity still persists. L. L. Thurstone in effect generalized 
the factor analytic method and invented multiple-factor analysis, which 
could be applied to data involving any number of factors. 

Probably the most significant influence on the development of the factor 
analytic method, and its application, over the past fifteen years has been 
the electronic computer. Prior to the availability of the electronic computer 
the completion of a factor analysis was arithmetically very laborious. Many 
problems in factor analysis could not be properly investigated because of
the enormous amount of computation required. The computer has not only
extended the applications of factor analysis, but has led to the solution of
problems which were hitherto arithmetically impractical. 

The purpose of the present chapter is to explain in a simple way the na- 
ture of the ideas involved in factor analysis. No attempt is made to instruct 
the reader in computational procedures, since all such computation is now
done by computer. The student who wishes to pursue the study of factor
analysis in greater depth is referred to the comprehensive and up-to-date
treatment of the subject by Harry H. Harman (1967). The present discussion
follows Harman's statistical notation. This may assist the reader who
wishes to pursue the topic beyond the elementary introduction presented
here. 


27.2 AN ILLUSTRATIVE ANALOGY


Some aspects of the nature of factor analysis can be readily understood by 
considering a simple analogy. Consider five urns, each one containing a
certain proportion of red balls and a proportion of balls of another color.
The proportions are as shown in Table 27.1. Most people would agree that
Table 27.1 provides a convenient way of describing the composition of the
balls in the five urns with respect to color.


Table 27.1 Proportion of balls of different colors in five urns

Urn    Red    Orange    Yellow    Green    Blue    Purple
1      .50    .50
2      .60              .40
3      .70                        .30
4      .80                                 .20
5      .90                                         .10


Let us now suppose that samples consisting of pairs of balls are drawn
from all possible pairs of urns in turn, and the number of agreements
counted. For example, a ball is drawn from the first urn, another from the
second. If the two balls are the same with respect to color, this is an
agreement. This process is continued until samples are drawn from all
possible pairs of urns. We note, however, that the probability of drawing a
red ball from the first urn is .50, the probability of drawing a red ball
from the second urn is .60; consequently, using the multiplication theorem of
probability, the expectation is that the proportion of agreements in drawing
samples of pairs from the first and second urns is .50 × .60 = .30.
Similarly, the expected proportion of agreements in drawing samples of
pairs from the first and third urns is .50 × .70 = .35, from the first and
fourth urns is .50 × .80 = .40, and so on. The expected proportion of
agreements in drawing samples of pairs from all pairs of urns is given in
Table 27.2.

Table 27.2 Expected proportion of agreements in drawing pairs of balls from
all possible pairs of urns

       1      2      3      4      5
1
2     .30
3     .35    .42
4     .40    .48    .56
5     .45    .54    .63    .72


Let us now suppose that we are given only the information contained in
Table 27.2. Is it possible to reconstruct the description of the urns
contained in Table 27.1? This is directly analogous to the factor analytic
problem. In factor analysis we are presented initially with a table of
intercorrelations which is the analogue of the proportions of Table 27.2.
These intercorrelations are measures, as it were, of what all possible pairs
of variables have in common. Factor analysis enables us to derive from the
table of intercorrelations a description of the variables which is analogous
to the description of the composition of the urns shown in Table 27.1. How
may the description as shown in Table 27.1 be obtained from the information
presented in Table 27.2? The proportion of red balls in each of the five urns
may be denoted by $p_1$, $p_2$, $p_3$, $p_4$, and $p_5$. The proportion of
orange, yellow, green, blue, and purple balls may be denoted by $q_1$, $q_2$,
$q_3$, $q_4$, and $q_5$. Likewise the proportion of agreements in Table 27.2
may be represented by $p_{12}, p_{13}, \ldots, p_{45}$. Also, we have the
relations

    $p_1p_2 = p_{12}$
    $p_1p_3 = p_{13}$
    $p_2p_3 = p_{23}$
    $\cdots$
    $p_4p_5 = p_{45}$

If we multiply $p_1p_2 = p_{12}$ by $p_1p_3 = p_{13}$, divide by
$p_2p_3 = p_{23}$, and take the square root, we obtain

[27.1]  $p_1 = \sqrt{\frac{p_{12}p_{13}}{p_{23}}}$

Similarly we can solve for $p_2$, $p_3$, $p_4$, and $p_5$. In general

[27.2]  $p_i = \sqrt{\frac{p_{ij}p_{ik}}{p_{jk}}}$

This means that, given only the data in Table 27.2, the description of Table
27.1 can be obtained. For example,

    $p_1 = \sqrt{\frac{.30 \times .35}{.42}} = .50$

Similarly $p_2 = .60$, $p_3 = .70$, $p_4 = .80$, and $p_5 = .90$.
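The recovery of the urn proportions from the table of agreements can be sketched as follows. The proportions are those of Table 27.1; the helper names `pair` and `recover` are illustrative assumptions. For each urn the sketch simply uses the first two other urns as j and k in formula [27.2].

```python
from itertools import combinations
from math import sqrt

# Proportion of red balls in each urn (Table 27.1).
p_red = [0.50, 0.60, 0.70, 0.80, 0.90]

# Expected proportion of agreements for every pair of urns (Table 27.2).
# Only red can agree, because the second color differs from urn to urn.
agree = {(i, j): p_red[i] * p_red[j]
         for i, j in combinations(range(len(p_red)), 2)}


def pair(i, j):
    """Look up the agreement proportion for urns i and j in either order."""
    return agree[(min(i, j), max(i, j))]


def recover(i, n_urns=5):
    """Recover p_i from the agreements alone, using two other urns j, k."""
    j, k = [u for u in range(n_urns) if u != i][:2]
    return sqrt(pair(i, j) * pair(i, k) / pair(j, k))


recovered = [recover(i) for i in range(5)]
print([round(p, 2) for p in recovered])
```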

Two points which relate to the present analogy should be noted. First,
the statistical model used in obtaining the description of Table 27.1 from
the information in Table 27.2 assumes that there is one factor only in
common. While this model is clearly appropriate here, models involving other
assumptions might be considered. Second, no information exists in Table
27.2 which would make possible an identification of factors. While the
proportions in Table 27.1 can be obtained from Table 27.2, the procedure
whereby this is done does not indicate that the common factor is red and
the specific factors are orange, yellow, and so on. The procedure indicates
only the presence of a factor common to all variables and factors specific
to each variable. In any application of the factor analytic method the
situation is the same. The analysis provides no information whereby the
nature of the factors can be identified; such identification depends on
considerations that are apart from the statistical analysis used.


27.3 BASIC EQUATIONS


The basic factor analytic model is that a score of individual i on variable j
can be conceptualized as the weighted sum of scores on a smaller number
of derived variables, called factors. This is a linear model, and may be
expressed in standard-score form as follows:

[27.3]  $z_{ji} = a_{j1}F_{1i} + a_{j2}F_{2i} + \cdots + a_{jm}F_{mi} + d_jU_{ji}$

The quantity $z_{ji}$ is the standard score of individual i on variable j.
$F_{1i}$ is the standard score of individual i on the first common factor;
$F_{2i}$ is his standard score on the second common factor; and $F_{mi}$ is
his standard score on the mth common factor. The quantity $U_{ji}$ is the
standard score of individual i on what is called a unique factor; that is, a
factor that is involved in a single variable only, in this case the variable
j. The coefficients $a_{j1}, a_{j2}, \ldots, a_{jm}$ are factor loadings.
These are weights which attach to the common-factor scores. The coefficient
$d_j$ is the weight which attaches to the unique-factor score. Factor
analysis is concerned primarily with the determination of the coefficients,
or loadings, $a_{j1}, a_{j2}, \ldots, a_{jm}$. It is not usually concerned
with estimating the factor scores $F_{mi}$, although methods for such
estimation exist.

The analogic relation between equation [27.3] and an ordinary multiple
regression equation is of interest. A multiple regression equation in
standard-score form may be written

[27.4]  $z_1' = \beta_2 z_2 + \beta_3 z_3 + \cdots + \beta_m z_m$

where $z_1'$ is a predicted standard score on the dependent variable;
$z_2, z_3, \ldots, z_m$ are the standard scores on the independent variables;
and $\beta_2, \beta_3, \ldots, \beta_m$ are weights so determined as to
maximize the correlation between the dependent variable and the weighted sum
of scores on the independent variables. What are the points of similarity and
difference between a multiple regression equation in [27.4] and the factor
analytic equation in [27.3]? First, the multiple regression equation states
that a


е conceptualized as the 
ther variables, called fac- 
5, В, and the factor loadings, 
ever, a distinction is made be- 


concerned with prediction, and equation 
Factor analysis is concerned with the discovery and description of
structure in complex patterns of relations. Third, it is worth noting that in
a multiple regression equation the independent variables
$z_2, z_3, \ldots, z_m$ are usually not independent of each other, but are
correlated. In the factor analytic equation, with some sets of data the
factors are defined in such a way as to be independent or uncorrelated. This
means that the factor scores $F_{1i}, F_{2i}, \ldots, F_{mi}$ are
uncorrelated with each other. With other sets of data correlated factors may
be used. The $F_i$ are not independent but are correlated. When the factors
are uncorrelated, they are said to be orthogonal. A method that leads to a
set of uncorrelated factors is spoken of as an orthogonal solution. When the
factors are correlated, they are said to be oblique. A method that leads to a
set of correlated factors is spoken of as an oblique solution.


Equation [27.3] expresses a standard score $z_{ji}$ as a linear function of
weighted-factor scores. Alternatively, the model may be written in a form
which expresses the variable $z_j$ as a linear function of weighted factors,
the factors being viewed as variables. Thus

[27.5]  $z_j = a_{j1}F_1 + a_{j2}F_2 + \cdots + a_{jm}F_m + d_jU_j$

Here the $a_{j1}, a_{j2}, \ldots, a_{jm}$ are factor loadings and
$F_1, F_2, \ldots, F_m$ are the factors as variables. For a set of n
variables this model may be written as

[27.6]  $z_1 = a_{11}F_1 + a_{12}F_2 + \cdots + a_{1m}F_m + d_1U_1$
        $z_2 = a_{21}F_1 + a_{22}F_2 + \cdots + a_{2m}F_m + d_2U_2$
        $\cdots$
        $z_n = a_{n1}F_1 + a_{n2}F_2 + \cdots + a_{nm}F_m + d_nU_n$

Such an arrangement is called a factor pattern. Usually a factor pattern is
written with the factor loadings recorded in columns with the number of
common factors shown at the top of the column. To illustrate, the following
is a factor pattern for three factors and five variables.

                   Factors
Variables    I          II         III        $h^2$
1            $a_{11}$   $a_{12}$   $a_{13}$   $h_1^2$
2            $a_{21}$   $a_{22}$   $a_{23}$   $h_2^2$
3            $a_{31}$   $a_{32}$   $a_{33}$   $h_3^2$
4            $a_{41}$   $a_{42}$   $a_{43}$   $h_4^2$
5            $a_{51}$   $a_{52}$   $a_{53}$   $h_5^2$

The value $h^2$ is known as the communality, and is the sum of the squares of
the common-factor loadings. The communality is discussed in the section to
follow.


27.4 COMPONENTS OF VARIANCE

Equation [27.3] is in standard-score form. Thus the scores $z_j$ and the
factor scores $F_i$ have zero mean and unit variance. By squaring both sides
of equation [27.3], summing over N cases, dividing by N, and assuming that
factor scores are uncorrelated, it can be readily shown that

[27.7]  $s_j^2 = 1 = a_{j1}^2 + a_{j2}^2 + \cdots + a_{jm}^2 + d_j^2$

This equation states that the total variance of a test j may be partitioned
into additive parts. The component of variance due to the first factor is
$a_{j1}^2$, that due to the second factor is $a_{j2}^2$, and so on. The
$a_{jm}$ are correlation coefficients between the variables and the factors.
Using the variance interpretation of the correlation coefficient, the
$a_{jm}^2$ may be interpreted as the proportion of the total variance that
may be attributed to the different factors.

In the terminology of factor analysis different components of variance
have been assigned different names. The communality of a variable,
usually represented by the symbol $h_j^2$, is the sum of the squares of the
common-factor loadings, a common factor being one that is common to
more than one variable in the set. Thus

[27.8]  $h_j^2 = a_{j1}^2 + a_{j2}^2 + \cdots + a_{jm}^2$

The communality is that part of the variance that may be attributed to
common factors. The part of the variance that is left over, and that cannot
be attributed to common factors, is called the uniqueness and is represented
by the symbol $d_j^2$. The uniqueness is sometimes partitioned into two
components, specificity, $b_j^2$, and error variance, $e_j^2$. The
specificity is that part of the total variance which is due to factors that
are specific to the particular variable and not due to measurement error.
Since all measurement involves error in some degree, in all applications of
factor analysis part of the uniqueness will be due to measurement error. If
the reliability coefficient for a particular variable is known, the error
variance may be readily calculated from the formula $e_j^2 = 1 - r_{jj}$,
where $r_{jj}$ is a reliability coefficient for test j. The quantity $e_j^2$
is nothing other than the error variance associated with a single test score
in standard-score form.
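The partition of unit variance into communality, specificity, and error variance can be sketched as below. The loadings and the reliability coefficient in the usage lines are hypothetical, and the function name `variance_components` is an assumption of this sketch. The sketch takes the factors to be uncorrelated, as in equation [27.7].

```python
def variance_components(loadings, reliability):
    """Partition the unit variance of a standardized variable.

    loadings    -- common-factor loadings a_j1 ... a_jm (orthogonal factors)
    reliability -- reliability coefficient r_jj of the variable
    """
    communality = sum(a * a for a in loadings)    # h_j^2
    uniqueness = 1.0 - communality                # d_j^2
    error_variance = 1.0 - reliability            # e_j^2
    specificity = uniqueness - error_variance     # b_j^2
    return {
        "communality": communality,
        "uniqueness": uniqueness,
        "specificity": specificity,
        "error_variance": error_variance,
    }


# Hypothetical variable with loadings .60 and .50 and reliability .85.
components = variance_components([0.60, 0.50], reliability=0.85)
print(components)
```

The communality, specificity, and error variance returned here sum to 1, the total variance of a standard score.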


27.5 REPRODUCING THE CORRELATION COEFFICIENTS


stated in different terms, how
good a fit is the linear model to the observed correlations?


standard-score form is given by

[27.9]  $r_{jk} = \frac{1}{N - 1}\sum_{i=1}^{N} z_{ji}z_{ki}$

We may write the following:

[27.10]  $z_{ji} = a_{j1}F_{1i} + a_{j2}F_{2i} + \cdots + a_{jm}F_{mi} + d_jU_{ji}$
         $z_{ki} = a_{k1}F_{1i} + a_{k2}F_{2i} + \cdots + a_{km}F_{mi} + d_kU_{ki}$

These two equations, which are also in standard-score form, may be
multiplied together, summed over N individuals, and divided by N - 1. The
common factors are uncorrelated with each other. The unique factors are
uncorrelated with each other and with the common factors. Given these
simplifying assumptions, the following result is obtained:

[27.11]  $r_{jk}' = a_{j1}a_{k1} + a_{j2}a_{k2} + \cdots + a_{jm}a_{km}$

This equation, which applies only to uncorrelated factors, shows how the
correlation coefficient between two variables may be reproduced from the
factor loadings. The quantity $a_{j1}a_{k1}$ is viewed as the contribution of
the first factor to the correlation, $a_{j2}a_{k2}$ as the contribution of the
second factor to the correlation, and so on. We have used $r_{jk}'$ to refer
to the reproduced correlation coefficient to distinguish it from $r_{jk}$, the
observed correlation. The reproduced correlation coefficient between two
variables is the sum of the products of the common-factor loadings. With real
data, because of sampling or other kinds of error, the correlations
reproduced from the factor loadings will differ in some degree from the
observed correlations. The differences between the observed and the
reproduced correlations are known as residual correlations,
$\bar{r}_{jk} = r_{jk} - r_{jk}'$. The magnitude of the residual
correlations indicates the extent to which the factors account for the
observed correlations, or how good a fit the linear model is to the observed
correlations.


27.6 THE GEOMETRY OF FACTOR ANALYSIS


Many problems in factor analysis are illuminated by conceptualizing them
in geometrical terms. We begin by considering the geometrical representation
of the correlation coefficient. Variable j may be represented by a vector of
length $h_j$, and variable k by a vector of length $h_k$. Denote the angle
between the two vectors by $\phi_{jk}$. It can be shown that the correlation
coefficient between two variables is equal to the product of the lengths of
the two vectors multiplied by the cosine of the angle between them; that is,

[27.12]  $r_{jk} = h_jh_k\cos\phi_{jk}$

For a simple proof of this result see Thurstone (1947). If the vectors are of
unit length, $h_j = h_k = 1$, and the correlation is simply the cosine of the
angle between the vectors. Fig. 27.1 shows the geometrical representation of
correlation coefficients of different magnitudes.

For vectors of unit length, a correlation of zero is represented by two
vectors at right angles to each other, a correlation of .707 by two vectors
at a 45° angle to each other, and a correlation of -.500 by two vectors with
an angular separation of 120°. Since the cosine of a zero angle is 1.00, a
correlation of 1.00 is represented by two vectors that coincide.
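The relation between angular separation and correlation is easy to verify numerically. The following sketch assumes nothing beyond equation [27.12]; the function name `correlation_from_vectors` is illustrative.

```python
from math import cos, radians


def correlation_from_vectors(h_j, h_k, angle_degrees):
    """Correlation implied by two vectors of lengths h_j and h_k
    separated by the given angle: r = h_j * h_k * cos(angle)."""
    return h_j * h_k * cos(radians(angle_degrees))


# Unit-length vectors at the separations discussed in the text.
for angle in (0, 45, 90, 120):
    print(angle, round(correlation_from_vectors(1.0, 1.0, angle), 3))
```

Vectors shorter than unit length, that is, variables with communality below 1, give a smaller correlation for the same angle.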


Fig. 27.1 Geometrical representation of correlation coefficients:
(a) $r_{jk} = .000$, vectors at 90°; (b) $r_{jk} = .707$, vectors at 45°;
(c) $r_{jk} = -.500$, vectors at 120°.


The geometrical representation of tables of three variables may be
considered. The following are three tables of correlations between three
variables.

(a)      1      2      3
    1
    2   .000
    3   .000   .000

(b)      1      2      3
    1
    2   .707
    3   .707   .707

(c)      1      2      3
    1
    2   .000
    3   .707   .500

The geometrical representations of these three tables of correlations,
assuming vectors of unit length, are shown in Fig. 27.2.

In Fig. 27.2a all correlations are zero and the three vectors are at right
angles to each other. In Fig. 27.2b the three correlations are .707 and the
three vectors are at a 45° angle to each other. In Fig. 27.2c the three
vectors have different angular separations. The examples of Fig. 27.2 must be
viewed as being in a space of three dimensions.


Fig. 27.2 Geometrical representation of three tables of correlation
coefficients, panels (a), (b), and (c).


Fig. 27.3 Configuration of vectors shown without (a) and with (b) reference
axes.


The geometrical representation of a table of correlations of more than
three variables is a simple extension of the above. The vector model is a
sheaf of vectors of varying lengths and angular separations. The lengths of
the vectors, $h_1, h_2, \ldots, h_n$, are the square roots of the
communalities. The sheaf of vectors is commonly known as a configuration. A
vector model for any substantial number of variables must ordinarily be
conceptualized as existing in a multidimensional space.

Geometrically the factorization of a table of correlations involves
inserting a set of reference axes in the configuration of vectors and
describing the terminal points of the vectors in relation to the reference
axes. The projections of the terminal points of the vectors on the reference
axes are the factor loadings. If the factors are uncorrelated, the reference
axes are at right angles to each other. To illustrate, the following is a
table of correlations between five variables.


          1      2      3      4      5
  1
  2     .65
  3     .51    .47
  4     .33    .33    .51
  5     .17    .23    .53    .51


The configuration of vectors corresponding to this table of correlations is shown in Fig. 27.3a. Figure 27.3b shows two orthogonal reference axes inserted in the configuration of vectors. The projections of the terminal points of the vectors on the reference axes are the factor loadings for the five variables. These are as follows:


  Variable      I      II     $h^2$
     1        .90    .10     .82
     2        .70    .20     .53
     3        .50    .60     .61
     4        .30    .60     .45
     5        .10    .80     .65


The communalities, which are the sums of the squares of the factor load- 
ings, are shown. The square roots of the communalities correspond to the 
lengths of the vectors shown in Fig. 27.3a. 
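
The arithmetic behind the communalities and the reproduced correlations can be sketched as follows. The loadings are taken here as .90/.10, .70/.20, .50/.60, .30/.60, and .10/.80, which is an assumption about how the printed table reads; the helper names are hypothetical.

```python
# Loadings of the five variables on factors I and II (assumed reading of the table).
loadings = [
    (0.90, 0.10),
    (0.70, 0.20),
    (0.50, 0.60),
    (0.30, 0.60),
    (0.10, 0.80),
]

def communality(row):
    """Sum of the squares of the factor loadings for one variable."""
    return sum(a * a for a in row)

def reproduced_correlation(row_j, row_k):
    """Sum of the products of the loadings of two variables on uncorrelated factors."""
    return sum(a * b for a, b in zip(row_j, row_k))

communalities = [round(communality(row), 2) for row in loadings]
vector_lengths = [round(communality(row) ** 0.5, 3) for row in loadings]
r_12 = round(reproduced_correlation(loadings[0], loadings[1]), 2)
r_35 = round(reproduced_correlation(loadings[2], loadings[4]), 2)

print(communalities)
print(vector_lengths)
print(r_12, r_35)
```

Under these assumed loadings the communalities agree with the $h^2$ column, and the reproduced values for $r_{12}$ and $r_{35}$ agree with the entries .65 and .53 in the table of correlations.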

A variety of methods exist for inserting reference axes in a configuration of vectors and thereby arriving at a factor matrix which reproduces approximately the correlation matrix. Quite clearly an indefinitely large number of solutions to this problem exist, since the reference axes could be located anywhere at all. In Fig. 27.3b the location of the reference axes is arbitrary. The reference axes could have been located elsewhere and a different factor matrix obtained.

27.7 THE COMMUNALITIES

A factor analysis begins with a matrix of correlations. What are the appropriate values to insert as diagonal elements in the matrix of correlations? Most factor analytic methods assume that the appropriate diagonal elements are the communalities, the sums of the squares of the common-factor loadings. The communalities are not known before factoring and must be estimated. One of the best estimates of the communality, which has much to commend it, is the squared multiple correlation, $R_j^2$, of each variable with the remaining variables. The squared multiple correlations, $R_1^2, R_2^2, \ldots, R_n^2$, of each variable with the n - 1 remaining variables can readily be obtained with an electronic computer.
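
One common way of obtaining all the squared multiple correlations at once uses the inverse of the correlation matrix (with unities in the diagonal): $R_j^2 = 1 - 1/r^{jj}$, where $r^{jj}$ is the j-th diagonal element of the inverse. The text does not say which computing method was used, so the sketch below is an assumption, and the small correlation matrix in it is hypothetical.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the corresponding row of the identity matrix.
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def squared_multiple_correlations(R):
    """SMC of each variable with the remaining variables: 1 - 1 / (diagonal of R inverse)."""
    inverse = invert(R)
    return [1.0 - 1.0 / inverse[j][j] for j in range(len(R))]

# Hypothetical correlation matrix for three variables, unities in the diagonal.
R = [[1.0, 0.5, 0.3],
     [0.5, 1.0, 0.4],
     [0.3, 0.4, 1.0]]
smc = squared_multiple_correlations(R)
print([round(v, 3) for v in smc])
```

For two variables the result reduces to the squared correlation between them, which is a convenient check on the function.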


27.8 PRINCIPAL-FACTOR SOLUTION

The direct method of factoring most commonly used at the present time is the principal-factor solution. This method was developed by Hotelling in 1933. Because of the laborious nature of the arithmetical calculations involved, it was not widely used until electronic computers became available.

The geometrical model for the principal-factor solution is of interest. Measurements on two variables for a sample of N members can be plotted as points in relation to two orthogonal reference axes. The result is the usual scatter diagram. If a correlation exists between the two variables, the arrangement of points will be ellipsoidal in form. Given more than two variables, say n, the points may be plotted with reference to n reference axes. The result is an ellipsoidal swarm of points. The principal factors correspond to the principal axes of this ellipsoid. Questions may be raised regarding the dimensionality of the ellipsoid. If 1's are inserted in the diagonals of the correlation matrix prior to factoring, the ellipsoid will have n principal axes, and n components will result. If estimates of the communalities are placed in the diagonals, m factors will result, where m is ordinarily less than n.

The identification of the principal axes of an ellipsoidal swarm of points can be shown to reduce algebraically to the identification of a set of factors in decreasing order of their contribution to the total communality, assuming that communalities are inserted in the diagonal of the correlation matrix.
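
The extraction of factors in decreasing order of contribution can be sketched numerically. The example below is not the author's computing procedure: it uses simple power iteration, which is an assumption, and applies it to the five-variable correlation table given earlier with the communalities .82, .53, .61, .45, and .65 placed in the diagonal. Function names are hypothetical.

```python
def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dominant_factor(M, iterations=500):
    """Loadings of the factor with the largest contribution, found by power iteration."""
    v = [1.0] * len(M)
    for _ in range(iterations):
        w = matvec(M, v)
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(x * y for x, y in zip(v, matvec(M, v)))
    loadings = [eigenvalue ** 0.5 * x for x in v]
    if sum(loadings) < 0:          # fix the arbitrary sign of the factor
        loadings = [-a for a in loadings]
    return loadings

def remove_factor(M, a):
    """Residual correlations r_jk - a_j * a_k after removing one factor."""
    n = len(M)
    return [[M[j][k] - a[j] * a[k] for k in range(n)] for j in range(n)]

# Correlations between five variables, communalities in the diagonal.
R = [
    [0.82, 0.65, 0.51, 0.33, 0.17],
    [0.65, 0.53, 0.47, 0.33, 0.23],
    [0.51, 0.47, 0.61, 0.51, 0.53],
    [0.33, 0.33, 0.51, 0.45, 0.51],
    [0.17, 0.23, 0.53, 0.51, 0.65],
]

f1 = dominant_factor(R)
R1 = remove_factor(R, f1)          # first-factor residuals
f2 = dominant_factor(R1)
R2 = remove_factor(R1, f2)         # second-factor residuals

V1 = sum(a * a for a in f1)        # contribution of the first factor
V2 = sum(a * a for a in f2)        # contribution of the second factor
print(round(V1, 3), round(V2, 3), round(V1 + V2, 3))
```

Because this table is reproduced exactly by two factors, the second-factor residuals vanish and the two contributions add up to the total communality of 3.06. The loadings found this way differ from the arbitrary loadings shown earlier, since the principal axes are a particular position of the reference frame.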
The brief discussion here attempts to present the algebraic rationale of the method. The solution begins with the determination of a factor $F_1$ which maximizes the quantity

[27.13]    $V_1 = a_{11}^2 + a_{21}^2 + \cdots + a_{n1}^2$

where $V_1$ is the contribution of $F_1$ to the total communality. The maximization of $V_1$ is subject to the condition that the sum of the products of the factor loadings will reproduce the correlation coefficients. Following the calculation of $F_1$ a matrix of first-factor residuals is obtained. The contribution of the first factor to $r_{12}$ is $a_{11}a_{21}$, and the residual correlation is $r_{12} - a_{11}a_{21}$. In general, the first-factor residual correlations are given by $r_{jk} - a_{j1}a_{k1}$. A table of residual correlations results. This is a table of correlations with the influence of the first factor removed and with residual communalities in the diagonal. The second factor, $F_2$, is obtained by maximizing

[27.14]    $V_2 = a_{12}^2 + a_{22}^2 + \cdots + a_{n2}^2$

where $V_2$ is the contribution of the second factor to the communality. Again a table of residuals is obtained which shows those parts of the correlations not accounted for by the first two factors.

Another direct method of factoring which has been very widely used is the centroid solution, developed by L. L. Thurstone. Many examples of factor analysis reported in the literature use the centroid solution. The centroid solution involves placing the first reference axis through the centroid of the configuration of vectors; obtaining a table of residual correlations, which are subject to certain adjustments; placing the second factor through the centroid corresponding to the table of residual correlations; and continuing the process until the magnitude of the residuals can be considered inconsequential. The centroid solution is an approximation to the principal-axes solution, and was used prior to the general availability of electronic computers.


е maximum- 
Lawley and 
ause of dif- 


27.9 ILLUSTRATIVE EXAMPLE OF PRINCIPAL-FACTOR SOLUTION

The principal-factor solution will be illustrated using data originally gathered by Holzinger and Swineford (1939). These data are reproduced from Harman (1967).

Twenty-four psychological tests were administered to children in grades seven and eight in a suburb of Chicago. The names of the 24 psychological tests, together with means, standard deviations, and reliability coefficients, are shown in Table 27.3. The intercorrelations between the 24 psychological tests are shown in Table 27.4.

Note that correlation coefficients below the main diagonal are shown. The table is, of course, symmetric about the main diagonal.

The data of Table 27.4 were analyzed using the principal-factor solution. In applying this method squared multiple correlations (SMCs) have been


Table 27.3 Basic statistics for twenty-four psychological tests*

                                              Standard     Reliability
  Test                              Mean      deviation    coefficient

   1. Visual Perception            29.60        6.90          .756
   2. Cubes                        24.84        4.50          .568
   3. Paper Form Board             15.65        3.07
   4. Flags                        36.31        8.38
   5. General Information          44.92       11.75
   6. Paragraph Comprehension       9.95        3.36
   7. Sentence Completion          18.79        4.63
   8. Word Classification          28.18        5.34
   9. Word Meaning                 17.24        1.89
  10. Addition                     90.16       23.60          .952
  11. Code                         68.41       16.84          .719
  12. Counting Dots               109.83       21.04          .937
  13. Straight-Curved Capitals    191.81       37.03          .889
  14. Word Recognition            176.14       10.72          .648
  15. Number Recognition           89.45        7.57          .507
  16. Figure Recognition          103.43        6.74          .600
  17. Object-Number                 7.15        4.57          .725
  18. Number-Figure                 9.44        4.49          .610
  19. Figure-Word                  15.24        3.58          .569
  20. Deduction                    30.38       19.76          .649
  21. Numerical Puzzles            14.46        4.82
  22. Problem Reasoning            27.73        9.77          .787
  23. Series Completion            18.82        9.35          .931
  24. Arithmetic Problems          25.83        4.70          .836

* Reproduced from Harry H. Harman's Modern factor analysis, 2d ed., University of Chicago Press, Chicago, 1967, with kind permission of the author and publisher.


used as estimates of the communalities. These are the squared multiple correlations of each variable with the n - 1 remaining variables. Table 27.5 shows the principal-factor solution obtained for these data. Five factors have been extracted. The body of the table shows the factor loadings for the 24 tests on the five factors. The rows at the bottom of the table show the contribution of each factor to the total communality. The contribution of each factor is simply the sum of the squares of the factor loadings. The per cent of the total communality contributed by each factor is also shown. To the right of Table 27.5 the original SMC communalities are shown together with the communalities obtained following the calculation by summing the squares of the factor loadings for the individual rows.

The reader should note that the contribution of the factors decreases from the first to the fifth factor. The first factor accounts for 64.2 per cent of


Table 27.4 Intercorrelations of twenty-four psychological tests


Table 27.5 Principal-factor solution for twenty-four psychological tests (communality estimates: SMCs)*

                         Common factors                         Communality
  Test      P1       P2       P3       P4       P5       Original   Calculated

    1      .595    -.039     .369    -.184    -.073        .51        .531
    2      .374     .026     .270    -.147     .121        .300       .250
    3      .433     .115     .396    -.112    -.276        .440       .446
    4      .501     .108     .290    -.178     .044        .409       .380
    5      .701     .312    -.273    -.050     .003        .673       .666
    6      .683     .404    -.213     .067    -.102        .677       .690
    7      .676     .412    -.284    -.082    -.046        .684       .716
    8      .680     .206    -.088    -.115    -.118        .564       .540
    9      .690     .446    -.212     .076     .036        .713       .727
   10      .456    -.469    -.446    -.128     .105        .579       .654
   11      .589    -.372    -.198     .076    -.185        .541       .565
   12      .448    -.491    -.154    -.263     .04         .537       .537
   13      .590    -.268     .018    -.300    -.255        .539       .575
   14      .435    -.063    -.012     .418    -.057        .358       .371
   15      .390    -.102     .055     .362     .101        .293       .307
   16      .512    -.098     .325     .259     .006        .429       .444
   17      .471    -.212    -.036     .388    -.087        .412       .426
   18      .521    -.331     .118     .145     .028        .443       .417
   19      .450    -.115     .110     .167    -.178        .367       .287
   20      .623     .135     .142     .049     .252        .464       .492
   21      .596    -.220     .076    -.140     .204        .413       .471
   22      .600     .103     .138     .053     .142        .449       .413
   23      .685     .063     .160    -.096     .154        .561       .532
   24      .635    -.169    -.192    -.009     .079        .527       .475

  $V_p$             7.665    1.672    1.208     .920     .447     11.943     11.912
  $100V_p/11.943$    64.2     14.0     10.1      7.7      3.7      100.0       99.7

* Reproduced from Harry H. Harman's Modern factor analysis, 2d ed., University of Chicago Press, Chicago, 1967, with kind permission of the author and publisher.


the total communality, the second for 14.0 per cent, the third for 10.1 per cent, the fourth for 7.7 per cent, and the fifth for only 3.7 per cent. Note also that all loadings on the first factor are positive, whereas subsequent factors contain both positive and negative loadings.
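
The bottom rows of Table 27.5 can be checked with a few lines of arithmetic. This is a minimal sketch using the factor contributions and the SMC total as printed in the table; the variable names are hypothetical.

```python
# Contributions V_p of the five principal factors, as printed in Table 27.5.
contributions = [7.665, 1.672, 1.208, 0.920, 0.447]
total_smc = 11.943   # sum of the original SMC communality estimates

# Per cent of the total communality contributed by each factor.
percents = [round(100 * v / total_smc, 1) for v in contributions]
calculated_total = round(sum(contributions), 3)

print(percents)
print(calculated_total, round(100 * calculated_total / total_smc, 1))
```

The percentages agree with the printed row, and the five contributions sum to the calculated total communality of 11.912, or 99.7 per cent of the SMC total.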

The data of Table 27.5 are not amenable to any psychological interpretation. Before any meaning can be assigned to factors, the factors obtained by the principal-factor solution must be subject to further analysis.




27.10 THE CONCEPT OF STRUCTURE 


The factors obtained by the principal-factor or other direct solutions are 
not ordinarily viewed as amenable to interpretation. A rotation of the refer- 
ence axes is made to a new position and derived factors are obtained. 
These can be assigned meaning. The need for the rotation of the reference 
axes has not always been fully understood although the reasons for this are 
quite straightforward. 

As discussed in Sec. 27.6 a table of correlations can be represented geometrically as a configuration of vectors. It is possible to conceptualize a random configuration of vectors, that is, a configuration where the vectors are arranged at random in relation to each other. Such a configuration is represented in Fig. 27.4a. Just as a random arrangement of points in a space of two dimensions has no structure or organization, similarly a random arrangement of vectors in a space of m dimensions has no structure or organization. The configurations of vectors that correspond to most tables of correlations are not random, but they have certain structural properties. In Fig. 27.4b the vectors are clustered in two groups of four vectors, the two clusters being roughly orthogonal to each other. This configuration of vectors is not random. It has structure. Here structure implies organization, and organization is viewed as progression from a state of randomness, and randomness as the obverse regression. The configuration of vectors in Fig. 27.4b may be described by inserting reference axes in an indefinitely large number of possible locations. One such location, obtained by a principal-factor solution, is shown in Fig. 27.4c. While this provides a description of the terminal points of the vectors, intuitively it does not communicate readily to our understanding the structural properties of the configuration. By rotating the reference axes to the position shown in Fig. 27.4d, a factor matrix is obtained which clearly shows the structure of the configuration. In Fig. 27.4d variables 1, 2, 3, and 4 have high loadings in factor I and small or zero loadings in factor II. Similarly variables 5, 6, 7, and 8 have high loadings in factor II and small or zero loadings in factor I. Most observers would agree that intuitively the reference axes in Fig. 27.4d provide a more meaningful description of the configuration than those in Fig. 27.4c.

Fig. 27.4 Configurations of vectors with and without reference axes.

The purpose of factor analysis is to discover, and satisfactorily describe, the structural properties of the matrix of correlations, or the corresponding configuration of vectors. This purpose is, it is hoped, achieved by rotation of the reference axes. The reader should note that the new location of the reference axes is determined, or compelled, by the structural properties of the configuration. The situation here is directly analogous to fitting a straight line to a set of points by the method of least squares. The precise location of the line, and the fact that it is a straight line rather than a curved line, is determined by the structural properties of the set of points.

For many years the rotation of axes was done using laborious graphical methods. These methods have been replaced by analytical methods which maximize or minimize certain statistical criteria.


27.11 ANALYTICAL METHODS OF ROTATION

Analytical methods of rotation have been developed by Carroll (1953), Ferguson (1954), Neuhaus and Wrigley (1954), Saunders (1953), and Kaiser (1958). All of these methods involve, with modification, essentially the same idea. This idea, as stated by Ferguson (1954), is as follows:


Consider a point p, plotted in relation to two orthogonal reference axes. These axes may be rotated into an indefinitely large number of positions, yielding, thereby, an indefinitely large number of descriptions of the location of the point. Which position provides the most parsimonious description? Intuitively it appears that the most parsimonious description will result when one or other of the axes passes through the point, the point being, thereby, described as a distance measured from an origin. Likewise as one or other of the reference axes is rotated in the direction of the point the product of the two coordinates grows smaller. This product is a minimum and equal to zero when one of the axes passes through the point, and is a maximum when a line from the origin and passing through the point is at a 45° angle to both axes. These circumstances suggest that some function of the product of the two coordinates might be used as a measure of the amount of parsimony associated with the description of the point. Note, further, that this product is an area, and is a measure of the departure of the point from the reference frame. Consider now any set of points with both positive and negative coordinates, the usual factor case. Here the sum of products is inappropriate because a zero sum can result when the sums of negative and positive products are equal. Here following the usual statistical convention we may use the sum of squares of products of coordinates, or some related quantity, as a measure of parsimony.
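
The behavior of the product of the two coordinates under rotation can be illustrated with a short sketch. The point (.3, .5) and the function names are hypothetical; the rotation is the ordinary rotation of orthogonal axes about the origin.

```python
import math

def coordinates_after_rotation(x, y, degrees):
    """Coordinates of the point (x, y) relative to axes rotated
    counterclockwise through the given angle."""
    t = math.radians(degrees)
    return (x * math.cos(t) + y * math.sin(t),
            -x * math.sin(t) + y * math.cos(t))

def product_of_coordinates(x, y, degrees):
    u, v = coordinates_after_rotation(x, y, degrees)
    return u * v

point = (0.3, 0.5)                                   # hypothetical point p
squared_distance = point[0] ** 2 + point[1] ** 2
angle_to_point = math.degrees(math.atan2(point[1], point[0]))

# First axis passing through the point: the product vanishes.
through_point = product_of_coordinates(*point, angle_to_point)
# Point at 45 degrees to both axes: the product is at its maximum.
at_45 = product_of_coordinates(*point, angle_to_point - 45)

print(round(through_point, 6), round(at_45, 6), squared_distance / 2)
```

The maximum product equals half the squared distance of the point from the origin, and the product is zero when an axis passes through the point, as the quoted passage states.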


The quantity under consideration for two factors and four tests is illustrated in Fig. 27.5. Here a measure of the departure of the terminal points of the vectors from the reference frame is the sum of the squares of the areas

$(a_{11}a_{12})^2 + (a_{21}a_{22})^2 + (a_{31}a_{32})^2 + (a_{41}a_{42})^2$

Fig. 27.5 Departure of points from reference frame. The sum of squares of the areas $a_{11}a_{12}$, $a_{21}a_{22}$, $a_{31}a_{32}$, and $a_{41}a_{42}$ is a measure of the departure of points from the reference frame.

For m factors and n variables we consider the m(m - 1)/2 sums of products of pairs of coordinates, which may be written as

[27.15]    $\sum_{p<q=1}^{m} \sum_{j=1}^{n} (a_{jp}a_{jq})^2$

The $a_{jp}$ and $a_{jq}$ are unrotated factor loadings. If $b_{jp}$ and $b_{jq}$ are now taken to denote factor loadings following rotation, the problem of rotation becomes one of locating the reference axes in such a position that the quantity

[27.16]    $\sum_{p<q=1}^{m} \sum_{j=1}^{n} (b_{jp}b_{jq})^2$

is a minimum. By some simple algebra it can be shown that minimizing this quantity is the same thing as maximizing

[27.17]    $Q = \sum_{p=1}^{m} \sum_{j=1}^{n} b_{jp}^4$


Thus the problem becomes one of locating the reference axes in a position such that the sum of all factor loadings raised to the fourth power is a maximum. The statistic Q is a measure of parsimony. It is a measure of the departure of the terminal points of the vectors from the reference frame. It is a measure in the least-squares sense of the goodness of fit of the reference frame to the configuration. One difficulty with this criterion relates to the weighting of variables. This criterion weights each variable in a manner proportional to the square of its communality. This means that a variable with a communality of .80 will exert four times the weight in determining the final solution that a test with a communality of .40 will exert. A possible solution here is to use normalized factor loadings, that is, factor loadings adjusted in such a way that the sum of their squares for each variable, the communality, is equal to 1. In the geometrical model this amounts to extending all vectors to unit length. The use of normalized factor loadings assigns equal weight to each of the variables.
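
The equivalence between minimizing the sum of squared products and maximizing Q follows because the communality of each variable is unchanged by an orthogonal rotation, so that for two factors Q plus twice the sum of squared products equals the fixed sum of squared communalities. The sketch below checks this numerically; the loadings are the assumed five-variable example used earlier (.90/.10 through .10/.80), and the function names are hypothetical.

```python
import math

# Assumed two-factor loadings for five variables.
loadings = [(0.90, 0.10), (0.70, 0.20), (0.50, 0.60), (0.30, 0.60), (0.10, 0.80)]

def rotate(rows, degrees):
    """Loadings on two orthogonal axes rotated counterclockwise by the given angle."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(a * c + b * s, -a * s + b * c) for a, b in rows]

def sum_squared_products(rows):
    """Sum over variables of the squared product of the two loadings."""
    return sum((b1 * b2) ** 2 for b1, b2 in rows)

def q_statistic(rows):
    """Sum of all loadings raised to the fourth power."""
    return sum(b1 ** 4 + b2 ** 4 for b1, b2 in rows)

# Sum of squared communalities; unchanged by rotation.
constant = sum((a * a + b * b) ** 2 for a, b in loadings)

angles = range(0, 90)
products = [sum_squared_products(rotate(loadings, d)) for d in angles]
q_values = [q_statistic(rotate(loadings, d)) for d in angles]

best_by_products = min(angles, key=lambda d: products[d])
best_by_q = max(angles, key=lambda d: q_values[d])
print(best_by_products, best_by_q, round(constant, 4))
```

Over the grid of whole-degree rotations, the angle that minimizes the sum of squared products is the same angle that maximizes Q.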
All analytical methods of rotation in common use involve, with modification, the essential ideas described above. Probably the method most widely accepted at present is a modification proposed by Kaiser (1958), who was led to consider the simplicity of a factor matrix. He defined the simplicity of any factor as the variance of its squared factor loadings and the simplicity of the factor matrix as the sum of these variances over all factors. The quantity, which is known as the raw varimax criterion, is as follows:

[27.18]    $V = n \sum_{p=1}^{m} \sum_{j=1}^{n} b_{jp}^4 - \sum_{p=1}^{m} \left( \sum_{j=1}^{n} b_{jp}^2 \right)^2$

This criterion weights each variable in a manner proportional to the square of the communality. Since empirically this weighting system did not yield a satisfactory result, Kaiser proposed a "normal" varimax criterion which may be written as follows:

[27.19]    $V = n \sum_{p=1}^{m} \sum_{j=1}^{n} \left( \frac{b_{jp}}{h_j} \right)^4 - \sum_{p=1}^{m} \left( \sum_{j=1}^{n} \frac{b_{jp}^2}{h_j^2} \right)^2$

Here all factors are normalized prior to rotation. All communalities are 1, all vectors are of unit length, and all variables receive equal weight. The normal varimax criterion is commonly used in factor analytic studies at the present time.
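
A normal varimax criterion of the form n times the sum of fourth powers of the normalized loadings, minus the sum over factors of the squared column sums of squares, can be computed directly. The sketch below assumes that form and, for two factors only, searches a grid of whole-degree rotations for the position that maximizes it. It illustrates the criterion and is not Kaiser's iterative procedure. The loadings are the assumed five-variable example used earlier, and the function names are hypothetical.

```python
import math

def normal_varimax(rows):
    """Normal varimax criterion: loadings are first divided by the square
    root of the communality of their variable."""
    n, m = len(rows), len(rows[0])
    normalized = []
    for row in rows:
        h = math.sqrt(sum(b * b for b in row))
        normalized.append([b / h for b in row])
    fourth_powers = sum(b ** 4 for row in normalized for b in row)
    squared_column_sums = sum(
        sum(row[p] ** 2 for row in normalized) ** 2 for p in range(m))
    return n * fourth_powers - squared_column_sums

def rotate(rows, degrees):
    """Two-factor loadings referred to axes rotated counterclockwise by the given angle."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(a * c + b * s, -a * s + b * c) for a, b in rows]

# Assumed two-factor loadings for five variables.
loadings = [(0.90, 0.10), (0.70, 0.20), (0.50, 0.60), (0.30, 0.60), (0.10, 0.80)]

best_angle = max(range(0, 90), key=lambda d: normal_varimax(rotate(loadings, d)))
print(best_angle, round(normal_varimax(rotate(loadings, best_angle)), 4))

# A configuration with perfect structure, and the same configuration
# with the axes turned 45 degrees away from the clusters.
simple = [(1.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, 1.0)]
print(normal_varimax(simple), round(normal_varimax(rotate(simple, 45)), 6))
```

For the four-variable configuration with perfect structure the criterion is 8 when the axes pass through the clusters and falls to zero when the axes are 45° away from them, which is the behavior the criterion is designed to reward.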


27.12 ILLUSTRATIVE EXAMPLE OF THE NORMAL VARIMAX METHOD OF ROTATION

The normal varimax method of rotation will be illustrated with reference to 
the principal-factor solution for 24 psychological tests as shown in Table 
27.5. The first four factors of Table 27.5 were rotated in order to maximize 
the normal varimax criterion of equation [27.19]. The fifth factor of Table 
27.5 was excluded from this analysis because of its very small contribution 
to the total communality. 

Table 27.6 shows the rotated factors. All loadings greater than .400 are shown in boldface type. The contributions of the factors to the total communality are shown at the bottom of the table. The calculated communalities, based on four factors instead of the original five, are shown to the right.

Table 27.6 Normal varimax solution for twenty-four psychological tests

             Verbal     Speed    Deduction    Memory     Calculated
  Test         M1         M2         M3          M4      communality

    1         .152       .204       .658        .165        .526
    2                    .089       .458        .069        .235
    3         .144      -.013       .579        .118        .370
    4         .231       .090       .557        .080        .378
    5         .745       .220       .199        .149        .666
    6         .764       .078       .199        .225        .680
    7         .803       .155       .196        .084        .714
    8         .577       .233       .345        .187        .526
    9         .795       .048       .203        .224        .726
   10
   11
   12
   13
   14         .207       .081       .060        .561        .368
   15         .125       .082       .105        .513        .297
   16         .073       .059       .414        .514        .444
   17         .142       .222       .061        .588        .419
   18         .026       .356       .285        .455        .416
   19         .130       .168       .250        .385        .256
   20         .384       .100       .426        .300        .429
   21         .171       .437       .495        .212        .429
   22         .351       .113       .407        .301        .392
   23         .368       .225       .521        .225        .508
   24         .351       .479       .176        .292        .469

  $V_p$               3.613      2.616      2.934       2.300      11.462
  $100V_p/11.462$    31.522     22.823     25.598      20.066

The rotated factors of Table 27.6 are amenable to psychological interpretation. Inspection of the tests used in this study suggests that the first factor is a verbal factor. Tests 5, 6, 7, 8, and 9 have high loadings on this factor. These tests are very largely tests of verbal ability. Inspection of the tests suggests that the second factor is a speed factor. The tests with the highest loadings in this factor are tests 10, 11, 12, and 13. The third factor is a deduction factor, and the tests loading highest on this factor include tests 1, 3, 4, and 23. The fourth factor has its highest loadings in tests 14, 15, 16, 17, and 18, and may be identified as a memory factor.


EXERCISES



1 If there is one general factor only, the general factor loading for variable i may be calculated by the formula

$a_i = \sqrt{\dfrac{r_{ij} r_{ik}}{r_{jk}}}$

This formula is seen to be directly analogous to formula [27.2]. Calculate the (a) general and (b) specific factor loadings from the following table of intercorrelations.
1 2 3 4 5 t 
1 i 
2 172 
3 6 6 : 
4 16 48 М К, 
5 18 118 14 04 
Tao 


2 Two variables, $z_1$ and $z_2$, may be expressed as linear functions of weighted factors as follows:


— дор, + .60F, — -05Fs + -10F4 + .Л8й, 
z= .05Ё + ‘тор. — -10F, + -60Fs << +370 


Calculate the correlation between $z_1$ and $z_2$.


3 The following are factor loadings on two factors for six variables:

Variable $F_1$ $F_2$


An pwn н 
Бла Мм б о 
DAD w %® & 


Calculate (a) the communalities, (b) the uniqueness for each variable, (c) the contribution of each factor to the total communality, (d) the per cent contribution of each factor to the total communality.

4 For the data of Exercise 3 above, write down the table of reproduced correlation coefficients.

5 The reliability coefficient for variable 6 in Exercise 3 is .80. For this variable calculate that part of the variance that is error variance.


6 In a normalized configuration the vectors are of unit length. Normalized factor loadings are obtained by dividing the factor loadings for any variable by the square root of the communality. Calculate the normalized factor loadings for the data of Exercise 3.

7 The loadings for a particular variable on two factors, $F_1$ and $F_2$, are, respectively, .3 and .5. These loadings may be plotted as a point with reference to two orthogonal reference axes. Calculate the distance of the point from the origin.

8 The following are loadings for six variables on two factors: 

Variable F, F, 

Ae AN 
1 2 2 
2 3 3 
3 EI 4 
4 2 E 
5 3 05) 

6 4 -.4 

SS йу 


Rotate these factors to what appears to be the most parsimonious position. Calculate the rotated factor loadings.


ANSWERS TO EXERCISES 


CHAPTER 1 
2 a. ratio b. ratio c. ordinal d. nominal e. nominal
f. interval g. ratio h. nominal i. interval j. ordinal


as N п 
за РИ УТЭУ 
i [1 det 


ia 
x N 
a Sane DAM ot LE 


іт ізі 


Sx, һізж i ao, 
gc eum іс 
$x, oii a 


iet 


4 Хы, BAN tA TX 
e (Х Y) + (Xe + Y) (4% Yq) + (Xa + Ya) + (Xs + Ys) 
d. Хау, + X Ys + Xa Ys 
4e) + Qt e) i 


e. (X, +e) + (А. + с) + (№ 
Е. Xile + Xsle + Xale + Xde + 


N N 
42Xc4 e) =Y XP 2e У Kit Net 
der 


Хс + сҮ, + сї, + сҮ, + cY, + сҮ, 


ігі 


N N 
s ў + б? 
det 


iei 


6 a. 139 b. 120 c. 14 d. 40 e. 86
f. -14 g. 1955 h. 94 i. 11.300

7 a. false b. false c. true d. true

8 5,050

9 a. 30 b. 560

10 a. 24 b. Nc c. 2N


CHAPTER 2 
l Class Interval Frequency Cumulative Frequency 
95-99 2 40 
90-94 2 38 
85-89 4 36 
80-84 6 32 
75-79 5 26 
70-74 4 21 
65-69 5 17 
60-64 2 12 
55-59 2 10 
50-54 4 8 
45-49 1 4 
40-44 2 3 
35-39 1 1 
Total 40 


2 Interval Exact Limits Mid-points 


95-99 94.5-99.5 97 
90-94 89.5-94.5 92 
85-89 84.5-89.5 87 
80-84 79.5-84.5 82 
75-79 74.5-79.5 77 
70-74 69.5-74.5 72 
65-69 64.5-69.5 67 
60-64 59.5-64.5 62 
55-59 54.5-59.5 57 
50-54 49.5-54.5 52 
45-49 44.5-49.5 47 
40-44 39.5-44.5 42 
35-39 34.5-39.5 37 
3    Class Interval    Exact Limits     Mid-points

a.   85-89             84.5-89.5        87
     80-84             79.5-84.5        82
     75-79             74.5-79.5        77
     etc.              etc.             etc.
b.   135-137           134.5-137.5      136.0
     132-134           131.5-134.5      133.0
     129-131           128.5-131.5      130.0
     etc.              etc.             etc.
c.   850-899           849.5-899.5      874.50
     800-849           799.5-849.5      824.50
     750-799           749.5-799.5      774.50
     etc.              etc.             etc.
d.   .800-.849         .7995-.8495      .8245
     .750-.799         .7495-.7995      .7745
     .700-.749         .6995-.7495      .7245
     etc.              etc.             etc.

4    Class Interval    Frequency    Cumulative Frequency    Cumulative Percentage
     17-18             3            50                      100
     15-16             7            47                      94
     13-14             1            40                      80
     11-12             5            39                      78
     9-10              6            34                      68
     7-8               6            28                      56
     5-6               8            22                      44
     3-4               8            14                      28
     1-2               6            6                       12
8 The distribution of intelligence quotients in a random sample of the population would ordinarily be approximately symmetrical with a mean in the neighborhood of 100. The distribution of intelligence quotients in a sample of university students would in most cases be positively skewed, with a measure of central location substantially greater than 100, possibly 115 or 120. Very few university students would be expected to have intelligence quotients below 100.

Marks on mathematics examinations are not infrequently more variable than marks on history examinations, a higher proportion of individuals making very high and very low marks.


CHAPTER 3 
1 3.46 


2 66.58 
If a constant c is added to all observations in a sample with mean $\bar{X}$, the new mean is $\bar{X} + c$. Multiplying by c results in a mean $c\bar{X}$.
а. 52.00 b. 40.00 

1,225 

a.15 b. 33.5 
a 55 b.47 


Ex.1 3.55, 4.00 
Ex. 2 67.36, 67.00 


с. 325 d.24 е. 5.0 


e ча с алы 


CHAPTER 4 


pi? |Б. ЖОҒ 45 39.20 4. 6.26 


1,485 
12.50 

а. 500 b..80 

но aia 


а. 67.78, 8.23 b. 477.63, 21.85 


—1.2, 1.3, 2.8 


m; = 20.80, ms = 0, ms 


e6 o -20u70 ^» о мә 




CHAPTER 5 
Iw 


әхіхіхіхіхі 
8 × 7 × 6 × 5 × 4 × 3 × 2 × 1 = 40,320
12 


о еж 3295 n»wWw 
х 
B 
x 
ge 
x 
+ 


= 
© 


i 

20 

60 

тіз 

20 minutes 

Tii 

(D^, 1005 

The expected distribution of heads in tossing six coins 64 times is 


Н 10° 1. 34 43.4555: 6 3 
b E 30: 75.20. "19:66; 51 


Bee ee ee - 
паи Ае мю н 


18 The expected distribution of 6's in rolling six dice 64 times is: 


6s f 

0 15,625/46,656 X 64 

l  18,750/46,656 x 64 

2 9,315/46,656 X 64 

3 2,500/46,656 х 64 

4 375/46,656 x 64 

5 30/46,656 x 64 

6 1/46,656 x 64 ` 
19 598/4,096 


20 7/64 х 31/46,656 


CHAPTER 6 

1 .0395, .1238, .3980, .2444, .0088 

2 3.50, 25.90, 79.40, 62.46, 34.28 

3 а, 4319 b. .3962 е. 4013 а. .0668 
f. .8289 в. .3830 h. .7066 1. 1331 
4 а. .675 b. 1.281 c. 1.281 d. +1.281 


e. .1038 
j. .0025 


5 а, .0099 b. .0918 е. .2514 а. .9050 
6 68 


7 An error. The score of 66 is more than four standard deviations above the mean. The occurrence of a score equal to or greater than 66 is very improbable in random samples.


8 Grade Score Interval 


A 62.8-
B 55.2-62.8
C 44.8-55.2
D 37.2-44.8
E -37.2


9 a. 79 b. 201 
D 


CHAPTER 7 
1 а, The correlation between the intelligence of parei i 

frequently been reported to be in the оваа offspring haa 

Fairly high positive in samples of individuals covering a broad age Kad Was с 

age of 17. d. Fairly high positive. e. Fairly high positive under NEU d: had 

nomic conditions. f. Probably about zero, although data have been re A end SN 

trary. в. Probably about zero, although arguments may be advanced (s aed к 1 


ative correlation. 


2 —.063 


ғ : 
sso mph Уят 


Hence by substitution 
{ Уху A Xxy 
(N — 1) 5;5у Зах" 


а +1апа —1 


5. The paired measurements тау be in exactly the same order, but the value of z, may 
і be exactly equal to the value of z, with which it is paired. Consequently Se ESI 
necessarily be equal to Уг, and Xs Under these circumstances Хг „2, will be EA th in 


either Хг,2 or Ха, and r will be less than 1. 


[ 
6 544 


57 The mean of X Yia X. Y. Hence we may.wnte 


1 
á  sxix-n-d-Dr.xa-D-0-DF ) 
get N-1 D 


N-1 


| 
2-Х + sv- z3x-30 -Y) 
i = N N-1 New 


= së + sy? my * 
The variance of sums is equal to the variance of the differences when the variables are 
x 


uncorrelated. 


CHAPTER 8


л жо ы 


6 


b. bxy=1.200 е. byy = .545 
е. X' = 1.2007 + .400 


b. У’ = .482Х + 38.360 с, 72.151 а. 103.033 


—1.00, —.84, .09, .58, .866 
Ү' = ЛХ + 5.593, Х' = 2.1007 + 14.050 
From [8.8], 


Hence гуу = Уба 


This statement is correct when the regression lines are not linear. Despite the possibly poor fit of the linear regression model, the fact remains that 50 per cent of the variance of one variable is predictable from the other variable. It is of course the case that, had a nonlinear regression model been used, more than 50 per cent of the variance of one variable might have been predictable from the other variable.

a. 300 b. 100


.866
+9.80


CHAPTER 9 


l a. A random sample is suc 


having 
e the members in 
ing included in the sub- 


he proporti 


с. Ап experimental sampling 
drawing of samples from a Population. [t is the 
thus obtained, A theoretical sampling distribution, on the 


а theoretical method based 9n probability co 
drawn, 


S а proportional 
distribution is obtained by the actual 
e values of the Statistic 
other hand, is obtained using 
onsiderations, No samples are actually 


A variety of methods Suggest themselves, Su 
directory, Numbers 


igned numbers, using a 
may be used. 


» ora table of random numbers 


Student population may for example, faculty, Let us Suppose that a 
university has 5.000 stud, i i 


n forming а sample of 100, 50 
contains 2,500 
th name may be 
th name may be 
stratified sample. 


or faculty index, 
1250 stude, 
ing sample 


of students, every fiftie 
nts. Here again every fiftie 
isa Systematic Proportional 
A random sample of names selected from a telephone directory may not be considered an appropriate sample of voting behavior. Owning a telephone may be correlated with income, and possibly other variables, which may in turn be correlated with particular political predilections. Bias might result. Such a sample would clearly exhibit less bias in 1966 than, say, in 1931.


P 


b. 1.12 е. .0179 : 


With replacement = 8.00. Without replacement = 7.53. 


a. 200 b. 400 
в. р Г b.o,—.1637 


50 15 
75 40 
100 15 


e 0 3 a 


10 Increase


CHAPTER 10 


1 


95: 103.04-106.96 
99:102.42-107.58 
75: 103.85-106.15 
85: 103.56-106.44 


95:48.24-51.76 
99:47.68-52.32 


By tripling sample size the standard error is multiplied by .5774, or divided by √3.
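The factor .5774 follows from the standard error of the mean, σ/√N: multiplying N by 3 divides the standard error by √3. A minimal Python check is sketched here; the function name and the example values of σ and N are illustrative assumptions, not figures taken from the exercise.

```python
import math

def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean for population SD sigma and sample size n."""
    return sigma / math.sqrt(n)

# Hypothetical values, chosen only for illustration.
sigma, n = 10.0, 25

# Ratio of the standard error after tripling N to the original standard error.
ratio = standard_error(sigma, 3 * n) / standard_error(sigma, n)
print(round(ratio, 4))
```

The ratio does not depend on the particular σ or N chosen, since both cancel.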


        95%            99%
a. 23.55-28.85    22.71-29.69
b. 55.35-61.25    54.42-62.18
c. 43.18-49.42    42.19-50.41
d. 5.73-11.07     4.89-11.91
a. 2.086 b. -3.850 c. 1.725 d. +1.725
a. 2.571 b. -6.859 c. 2.015 d. +2.015


a. .005 b. .050 c. .999 d. .015 e. .965


8 Number 


очо сле оюн о 


9 .685-.815, .664—.836 


CHAPTER 11
1 t = 1.74; p > .05

2 t = 1.62; p > .05

3 t = 2.81; p < .05

4 t = 1.22

5 t = 2.96

6 Groups are ordinarily matched to reduce sampling error. If matching leads to a positive
correlation between the values of the independent variables for the matched pairs, the
standard error of the difference between means, for example, will be less than that which
would have been obtained for independent samples without matching.


7 Cochran & Cox: t.05 = 2.17; p > .05
Welch: t.05 = 2.05, df = 27; p > .05


CHAPTER 12 
1 z = 3.086; p < .01

2 z = 1.414; p > .05

3 z = 1.704; p > .05

4 z = 3.620; p < .001

5 F = 1.363; p > .10
6 t = 1.350; p > .05
7 a. .867 b. .050 c. -.693 d. -2.647


8 2g-144 11.989 1-239 gpa 


9 :—2065 p < 05 ] 


2.450 


1= 3.222; p < .01 


. 


2 Consider a simple experimental situation involvin 


3 In the above question matching of ex 


CHAPTER 13
1 χ² = 11.33, df = 5; p < .05 > .02
2 Class Intervel Observed Frequency Theoretical Frequency 
90-99 1) 6 
80-89 5 8 
70-79 17 17 
60-69 30 31 
50-59 50 39 
40-49 35 34 i 
30-39 10 20 x 
20-29 6 8 
10-19 4 
6 : 
0-9 2 a 
160 160 
χ² = 9.498, df = 5; p > .05


8 &1 b2 е8 
4 χ² = 3.163; p > .05
5 χ² = 14.287; p < .001
6 χ² = 5.920; p < .02

a. χ² = .633; p > .05 b. χ² = 5.206; p > .05


CHAPTER 14
1 a. In the simplest type of experiment two variables are involved, an independent and
a dependent variable. The value of the dependent variable is thought to depend in
some way on the value of the independent variable. To illustrate, an experiment might be
conducted to investigate the effects of different amounts of practice on intelligence test
items on intelligence test performance. Here intelligence test performance is the dependent
variable and amount of practice constitutes the independent variable. In an experiment
designed to investigate the relation between size of lesion in a particular brain locus
and maze performance, measures of maze performance constitute the dependent variable
and size of lesion the independent variable. The investigator wishes to ascertain how maze
performance depends on lesion size. b. A treatment variable is in effect the creation
of the investigator, who decides the particular values which the variable will assume and
the frequency of occurrence of these values. Different methods of learning a language,
different dosages of a drug, and different amounts of practice in learning a task are
examples of treatment variables. A classification variable, on the other hand, is one
whose values, and the frequency of their occurrence, are not within the control of the
investigator but exist, as it were, in nature. Height, weight, College Board test scores,
and the like, are examples of classification variables.

2 Consider a simple experimental situation involving an experimental and a control group.
Subjects in the two groups are matched on a variable X; consequently the two groups are
equivalent with respect to X. If X is positively correlated with Y, a possibility exists that
the paired values of Y for the experimental and control groups will also be positively
correlated. If this is so, the sampling variance of the difference between means, and other
statistics, will be less than had the two groups been independent.

3 In the above question matching of experimental subjects on a single variable X was
discussed. Clearly the subjects do not have a single attribute which may be correlated
with Y, but possibly a large number of such attributes. Randomization tends to ensure that
groups of subjects will not differ significantly on a large number of such attributes.
Randomization also tends to ensure that groups of subjects will not differ significantly
on a large number of attributes, some of which may be correlated with Y.

One very simple procedure is to number subjects from 1 to 100, write the numbers on
slips of paper, and draw the slips of paper from a well-shuffled container, allocating a
subject in turn to each of the five groups. Tables of random numbers could be used.
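The slips-of-paper procedure can be imitated with a pseudorandom shuffle. The sketch below is one possible rendering in Python, not a prescribed method; the function name `allocate`, its parameters, and the use of a seed are assumptions made for illustration.

```python
import random

def allocate(n_subjects: int = 100, n_groups: int = 5, seed=None):
    """Shuffle subject numbers and deal them in turn to the groups."""
    rng = random.Random(seed)
    subjects = list(range(1, n_subjects + 1))   # subjects numbered 1 to 100
    rng.shuffle(subjects)                       # stands in for the well-shuffled container
    groups = [[] for _ in range(n_groups)]
    for draw, subject in enumerate(subjects):
        groups[draw % n_groups].append(subject)  # allocate each draw to the next group in turn
    return groups

groups = allocate(seed=1)
print([len(g) for g in groups])
```

Dealing the shuffled numbers in rotation keeps the five groups equal in size while leaving membership to chance.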


Subjects may be assigned to treatments at random, as described in Sec. 14. Since
N ≠ K², in this case no systematic arrangement of the Latin square type is possible.
Unless either equal or proportional groups are used, the factors are not independent of
each other. When the groups are either equal or proportional, conclusions can be reached
about the effect of each of the factors on the dependent variable, quite apart from, and
independently of, the other factors used in the experiment.

A randomized block experiment is one in which a basis exists for arranging subjects into
subgroups or blocks. Subjects may be assigned at random to treatments, a restriction
being that each treatment occurs once only in each block.


A B C D E
B C D E A
C D E A B
D E A B C
E A B C D
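A 5 x 5 Latin square of this cyclic kind can be generated by rotating the list of treatments one place per row. The following Python sketch illustrates that construction; the function name is hypothetical, and a cyclic square is only one of many possible Latin squares.

```python
def cyclic_latin_square(treatments):
    """Row i is the treatment list rotated left by i places."""
    k = len(treatments)
    return [[treatments[(i + j) % k] for j in range(k)] for i in range(k)]

square = cyclic_latin_square(list("ABCDE"))
for row in square:
    print(" ".join(row))

# Each treatment occurs exactly once in every row and every column.
assert all(sorted(row) == sorted("ABCDE") for row in square)
assert all(sorted(col) == sorted("ABCDE") for col in zip(*square))
```

In practice the rows, columns, and treatment labels of such a square would usually be permuted at random before use.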


See Sec. 1.8. 


CHAPTER 15 


1 
2 


3 


4 


a 


6 


a. df1 = 1, df2 = 38 b. df1 = 3, df2 = 52 c. df1 = 3, df2 = 51
F-32331, df =4, йу, = 20; p < 0) 


F = 2.06, df1 = 3, df2 = 21; p > .05
F = 2.21, df1 = 2, df2 = 27; p > .05
502 = 59.36, 54^ = 68.49, F = 0.87, df, — 1, de=; p> -05 


The assumption of normality, homogeneity of variance, and additivity
F = 3.13, df1 = 2, df2 = 12; p > .05


CHAPTER 16 


1 




Source         Sum of Squares    df    Variance Estimate
Row                754.08        11         68.55
Column             142.17         2         71.09
Interaction        598.50        22         27.20
Total             1494.75        35


F = 2.61, df1 = 2, df2 = 22; p > .05
3 Е, =1.23;р > .05 Е„= .23; р > .05 Е = .31; р > .05 
4 F,=3.85; р > .05 Е, = 8.19; р < .01 Е = .08; р > .05 


CHAPTER 17 


1 Source          df
R                  4
C                  3
L                  2
RxC               12
RxL                8
CxL                6
RxCxL             24
Within cells     540

Total            599
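The degrees of freedom in this table follow the usual rules for a three-way classification. The Python sketch below reproduces them under an assumed design of 5 rows, 4 columns, 3 layers, and 10 observations per cell; those sizes are inferred from the df values and are not stated in the answer itself, and the function name is hypothetical.

```python
def three_way_df(r: int, c: int, l: int, n: int) -> dict:
    """Degrees of freedom for an R x C x L classification with n observations per cell."""
    df = {
        "R": r - 1,
        "C": c - 1,
        "L": l - 1,
        "RxC": (r - 1) * (c - 1),
        "RxL": (r - 1) * (l - 1),
        "CxL": (c - 1) * (l - 1),
        "RxCxL": (r - 1) * (c - 1) * (l - 1),
        "Within cells": r * c * l * (n - 1),
    }
    df["Total"] = r * c * l * n - 1
    return df

# Assumed design sizes (inferred from the tabled df, not given in the text).
table = three_way_df(5, 4, 3, 10)
for source, value in table.items():
    print(f"{source:<13}{value:>4}")
```

The component degrees of freedom add up to the total, which is a convenient check on any such table.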

2 Source          SS
RxC           562.50
RxL              .00
CxL              .00
RxCxL            .00

3 a. RxCxL interaction b. RxC interaction c. within cells

4                                  Mean
Source             SS       df    Square       F
Rows             12.00       1     12.00      .21
Columns         216.75       1    216.75     3.80
Layers         1800.75       1   1800.75    31.57*
RxC             280.33       1    280.33     4.92*
RxL               8.33       1      8.33      .15
CxL             114.08       1    114.08     2.00
RxCxL           208.33       1    208.33     3.65
Within cells   2281.30      40     57.03


* Significant at the .05 level.


5                                  Mean
Source             SS       df    Square       F
Row            1763.57       2    881.78    28.17*
Column           91.01       2     45.51     1.45
Layer             3.56       1      3.56      .11
RxC             162.32       4     40.58     1.30
RxL              36.78       2     18.39      .59
CxL            2507.70       2   1253.85    40.06*
RxCxL           101.62       4
Within cells   4507.65     144
* Significant at the .01 level.


6               Sum of            Mean
Source         Squares     df    Square       F
Subjects        258.69      3     86.23
Rows            105.06      1    105.06     1.42 (SR)
Columns         189.06      1    189.06    26.15* (SC)
SxR             220.69      3     73.56
SxC              21.69      3      7.23
RxC              68.06      1     68.06    16.10* (SRC)
SxRxC            12.69      3      4.23
Note: Error term shown in parentheses following F ratio.
* Significant at the .05 level.
CHAPTER 18 
1 t = 4.23, df = 20; p < .01
2 Comparison Е 
LII .006 
„ш 19.91 " 
m 6501. дед 
Lv 14.71 d = QE 
п.ш 20.61 Significant compari 
parison (p < .01) 
m M 1927 LH; LIV; II, Hl; 
d i IL IV; IV, V 
Il, IV 12.97 
I, v .39 
IVY 17.87 
3 Table of Q 
АНУ ҮІ; ІШ wv 
п И 5.535* 6.42 11.513* 
1 5.424* 6.310" 5.978* 
у -886 5.978* 
n 5.092 
0з= 4.02, Q; = 1.61, Q, = 5.02, Q; = 5.35 
* Significant at the Ql level, ——————— —— 
4 Comparison Е 
Land II 726 
I and Ш .68 


df=3, d£ = 21 
I and IV 206 Fi = 14.61 
II and Ш 2.14 No comparisons are signific: 
Папа IV 1.04 abl level, poe RUM 
Ш and ТУ 5.90 


CHAPTER 19


ł 


а. 565.35, 275.63, 3.00 2 
Ь. 9.92, 215.63, 3.00 
с. 27.79, .30 


2.62 




k = 6

c_1j: -5, -3, -1, 1, 3, 5
c_2j: 5, -1, -4, -4, -1, 5
c_3j: -5, 7, 4, -4, -7, 5
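Coefficients of orthogonal polynomials for k = 6 should each sum to zero and should have zero sums of cross products with one another. A short Python check of the linear, quadratic, and cubic sets is sketched below; the dictionary keys are labels chosen here for readability.

```python
from itertools import combinations

coefficients = {
    "linear":    [-5, -3, -1, 1, 3, 5],
    "quadratic": [5, -1, -4, -4, -1, 5],
    "cubic":     [-5, 7, 4, -4, -7, 5],
}

# Each set of coefficients sums to zero.
sums = {name: sum(c) for name, c in coefficients.items()}

# Every pair of sets has a zero sum of cross products (orthogonality).
cross_products = {
    (a, b): sum(x * y for x, y in zip(coefficients[a], coefficients[b]))
    for a, b in combinations(coefficients, 2)
}

print(sums)
print(cross_products)
```

Because the sets are mutually orthogonal, the linear, quadratic, and cubic sums of squares partition the between-groups variation without overlap.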


Linear 17.72 
Quadratic 13.64 
Cubic 134.09 
Deviation 79.17 
Within-groups 3.62 
F1 = 4.90 (p < .05), F2 = 3.77 (p > .05), F3 = 37.04 (p <.


See Sec. 19.10. 


CHAPTER 20 


1 Adjusted total sums of squares on X = 319.13 
Adjusted within-groups sums of squares on X = 128.73 
Adjusted between-groups sums of squares on Х = 190.41 
F = 8.14, df, = 2, df; = 11; p < .01 


a. 1.39 b. .72 c. 1.75
Gp I, 15.14; Gp II, 22.51; Gp III, 29.95
4 A = 125.54, B = 128.73, F = .15; p > .05


CHAPTER 21 

1 ρ = .643, p = .05 for one-tailed test

2 ρ = .612, p < .05 for one-tailed test

3 a. p > .05 for one-tailed test b. p < .05 for one-tailed test c. p < .05 for one-tailed test
а.5=-3 Ь. 5=—1 с.5-2 4.5--3 е. 5=—1 #5=-3 


a. σ_S² = 28.33   b. σ_S² = 26.33   c. σ_S² = 23.07   d. σ_S² = 21.00   e. σ_S² = 19.80
   z = .376          z = .000          z = .208          z = .437          z = .000


f. σ_S² = 16.20
   z = .498


ρ_AB = .607, ρ_AC = .429, ρ_BC = .393, W = .651
K = .000 
К = .400 


CHAPTER 22 
1 χ² = 5.63, p < .02 > .01


2 p = .18
χ² = 13.60, p < .01
z = 2.69, p < .01
W+ = 28, p > .05
H = 12.87, p < .01


χ² = 15.86, p < .01
8 z = 3.87, p < .01


Ра = .0186 


For a above, two arrangements would prove to be significant with p < .05. For b, one
arrangement is significant with p < .01 and two with p < .05.


CHAPTER 23 

1 a. .750, .667, .500, .333, .250
b. .188, .222, .250, .222, .188

2 


+087, .316, .447, -190, .776 
118 

+042, .169, .250, +419, .376 
-038, .151, .224, -315, .336 


CHAPTER 24 
1 57-202.50, s? = 22.50; 10 per cent 
2 .462, .667, .837, .889, 980 
3 .968, 928 
4 rey = 833 
5 s#/N = 4.96, s2/N = 1.24. The proportion due to error will be .077.
6 .707, .849


7 625


8 s, (7) = 2.75, $ (26) = 3.57, 5, (44) = 2.32 


CHAPTER 25 


1 The kth percentile point is a value such that k per cent of the members of a sample have
values less than that value. A percentile rank is a value on a transformed rectangular scale
which extends from zero to 100.


2 ар.. = 105.00 b. PR (103) = 24.17 
рь = 11617 РЕ (123) = 65.00 
р» = 125.17 РК (136) = 95.83 


3 а. py =56.32 b. PR (59) = 10.50 


Dao = 65.71 PR (74) = 53.50 
Po = 80.71 РК (82) = 84.25 
4 Class Interval 
45-49 1 
40-44 2: 
35-39 3 
30-34 6 
25-29 8 
20-24 17 
15-19 26 
10-14 и 
5-9 2 
0-4 -0 
76 
5 Stanine Exact Limits of Interval 
9 39.43- 
8 32.53-39.43 
7 26.08-32.53 
6 21.44-26.08 
5 17.85-21.44 
4 15.36-17.85 
3 12.39-15.36 
2 9.97-12.39 
1 - 9.97 


6 8-зсогев: —1.451, —.605, 1.560, —.537, —.767, —.669, —.229, 2.106, —.362, —.292, . 


994 


CHAPTER 26 
1 31.4 per cent 


2 а. 828 Ь. .826 с. 2. = .5452, + 47023 d. х', = .449х, + .580х. — 1.65 е. .83 


За. 2 


e. ЛП а. .734, .642, .638 


nt Frequency Т Score 


.4392, + .2682, — .071z, + 40025 — 29625 
b. x’, = 673х, + .330ху — .081x4 + 6725; — .436% — 61.59 




CHAPTER 27 
1 Variable 
йр». 16000 
.4359 
E 
.9798 
.9798 


a. .90, .73, .58, .72, 61, .52 
b. .10, .27, .42, .36, .39, .48 
€. 2.71, 1.35 d. 66.75, 33.25 


2 3 4 5 6 


Fy: .949, .936, .919, -707, .640, .555 
Ex .316, .351, -394, .707, .768, .832 
.5831 


Rotated loadings are: 
Fy: .2828, 4243, .5657, . 


ОЛО .2828, 4213, 5657 


GLOSSARY OF SYMBOLS 


For most commonly used statistics, Roman letters denote sample values and Greek letters
denote parameters. Exceptions to this are made either for convenience or in conformity with
common usage. For example, ρ denotes both the population value of the correlation
coefficient and the sample value of Spearman's rank-order correlation coefficient; η
denotes the sample value of the correlation ratio. T, not τ, denotes a true measurement.
A bar above a symbol always indicates the arithmetic mean of a sample of observations. A
few symbols are used with double or triple meanings. Homonyms are permissible in the
language of mathematics as in any other.
Some symbols with idiosyncratic use in a restricted context are not listed.


a Constant in a regression equation; used with subscripts as a_yx and a_xy; first
subscript denotes the predicted variable, second the observed variable. Geometrically,
a_yx and a_xy are the points at which the regression lines intercept the Y and X axes.
b Regression weight applied to an independent variable, or predictor, in original
units; used with subscripts as b_yx and b_xy to distinguish predicted from observed
variable. Geometrically, b_yx and b_xy are the slopes of regression lines.


c (1) A constant.
(2) Denotes cth column in a set of C columns.
C (1) Number of columns.
(2) Contingency coefficient, measure of association between nominal variables.
C_r^N Number of combinations of N things taken r at a time.
d (1) Difference between paired ranks.
(2) Difference between the mean of a subgroup and the mean of combined groups,
X̄_j - X̄ = d.


D Difference between paired measurements. 

df Degrees of freedom. 

e Base of Napierian logarithms, 2.7183. 

e_i Sampling or measurement error associated with the ith value.
E (1) Expected frequency in the calculation of χ².


(2) The expectation of or expected value, as E(X) or E(X - μ).
f Frequency in a distribution; used with subscript to denote interval or subclass, f_i.


fafa Marginal frequencies in bivariate distribution. 


Cell frequency in bivariate distribution.

Ratio of two sample variances.

Measure of skewness.

Measure of kurtosis.

Size of a class interval.

Null hypothesis, as in H0: μ1 - μ2 = 0.

Subscripts used to identify particular observations in a group.

(1) Number of subclasses.

(2) Number of times a test is lengthened.

Kendall's coefficient of consistence.

The rth moment about the arithmetic mean.

(1) Number of observations in a subclass; used with subscript to indicate
subclass, n_i, n_j, n_ij, etc. Always used where the number of subclasses is
greater than 2.

(2) Number of test items.

Number of observations in a sample.

Number of members in a population.

Observed frequency in the calculation of χ².

(1) Sample proportion in the ith class; estimate of the probability of the occur-


correlation coefficient.
Tetrachoric correlation coefficient.
Reliability coefficient.

Reliability coefficient for a half test.
Reliability coefficient for a test lengthened k times.
Partial correlation coefficient.

(1) Number of rows.

(2) Sum of ranks; used with subscript to denote sum

(3) Multiple correlation coefficient.
(1) Standard deviation; used with subscript to denote variable, s_x, s_y, etc.
(2) Estimate of the standard error of a statistic; subscript denotes statistic.

Variance estimate, the square of any standard deviation; used with subscripts as
indicated under s.
Sample standard deviation corrected for grouping error.
Standard error of estimate.
Standard deviation o



(2) Number of values tied at a particular rank in a set of ranks.

(1) True value of an observation or measurement; used with subscript, T_i.

(2) In the analysis of variance denotes the sum of observations; T_j is the sum of
observations in the jth group.

(3) Correction factor for ties in the calculation of Kendall's tau and coefficient of
concordance; used with subscript to denote variable, T_x, T_y.

Kendall's coefficient of concordance.

Variable expressed as deviation from the arithmetic mean; subscripts denote particular
values of the variable, x_i, y_i, etc.

Variable expressed as deviation from arbitrary origin, sometimes involving a
change in unit. Computation variable.

Variable in original units; subscripts used to denote particular values of variable,
X_i, Y_i, etc.

Arithmetic mean of a sample. A bar above a symbol always denotes a sample mean.

Arbitrary origin.

Ordinate of unit normal curve.

(1) Variable expressed in standard-score form, z = (X - X̄)/s_x. Subscript used to
denote variable, z_x, z_y.

(2) Deviation from origin along base line of normal curve of unit area and unit
standard deviation.

Transformation of the correlation coefficient to approximate normal form; used in
tests of significance on r.

Regression weight in a multiple regression equation applied to an independent 

variable or predictor in standard-score form. 

Correlation ratio. 

Population value of a proportion. 

Population mean; used with subscript to indicate variable, μ_x.

Ratio of the circumference of a circle to the diameter, 3.1416.

(1) Correlation coefficient in a population.

(2) Sample value of the rank-order correlation coefficient.

Maximum likelihood estimate of ρ.

(1) Standard deviation of a population; used with subscript to denote variable,
σ_x, σ_y, etc.


statistics, оу, Ор, etc. 
Maximum likelihood estimate of с. 


below define limits of the summation. 


Kendall's coefficient of rank correlation, tau. E 
Phi coefficient, measure of fourfold point correlation. 


Chi square. 

a is greater than b.

a is less than b.
a is greater than or equal to b.
a is less than or equal to b.
a is very much greater than b.

a is very much less than b.

Is equal to.

Is not equal to. 

Absolute value of a. 

The square root of a. 

Infinity. 


APPENDIX TABLES


A. Ordinates and Areas of the Normal Curve
B. Critical Values of t
C. Critical Values of Chi Square
D. Critical Values of F
E. Transformation of r to z_r
F. Critical Values of the Correlation Coefficient
G. Critical Values of ρ, the Spearman Rank Correlation Coefficient
H. Probabilities Associated with Values as Large as Observed Values of S
   in the Kendall Rank Correlation Coefficient
I. Critical and Quasi-critical Lower-tail Values of W+ (and their
   Probability Levels) for Wilcoxon's Signed-rank Test
J. Coefficients of Orthogonal Polynomials
K. Critical Lower-tail Values of R1 for Rank Test for Two Independent Samples
L. Critical Values of the Studentized Range Statistic
M. Squares and Square Roots of Numbers from 1 to 1,000




Table A Ordinates and areas of the normal curve* 
(In terms of σ units)


= 5 = = 
о drea Ordinate a Area Ordinate wo Area Ordinate 
00 — .0000 .3989 -50 .1915 3521 
.01 20080 13059 51 1950 “3508 107 И 2390 
02 1000 3059 -32 1985 13435 1.02 0361 12371 
-03 10120 "3983 -53 |2019 13467 1.03 23485 E 
04 — .0160 —— 3986 254 12054 E 1:04 — 3508 12323 
-05 — .0199 зо ES. 
.06 .0239 -3982 ce aoe 340 i Um 11278 
07 0279 `3ово 157 12157 Em 07 35% E 
08 — 10319 3977 -58 2190 3372 iu dI dan 
09 10350 — (3973 559 12224 13352 Lo Or 222, 
: x 3e 2 
-10 — .0398 3970 
n oes с «m ES -3332 1.10 3643 21719 
CN NES dom 3312 1.11 — 13665 -2155 
hs Ms Sa $om 32 102 “3686 2131 
AM 0557 -3951 ЖР a 32 2 53108 2207 
К 2389 3251 1.14 3729 2083 
-15 056 3945 
16 0636 73939 s 9422 3230 1.15 3749 -2059 
„{7 0675 .3932 vi 2484 3209 1.16 .3770 .2036 
480 .0714 3925 168 22 йі 1 i 12270 :2012 
49 20753 13918 т " - 8 зво .1989 
-69 .2549 3144 1.19 3830 +1965 
-20 0793 озду 2 
-21 0832 23902 E En 25123 1.20 3840 21942 
22 0871 .3894 -72 12642 {8101 1.21 3869 11919 
23 Соо 13885 -73 12673 00 ias 88 EE 
“4 (0480 “3876 a. 7 s . 3907 11872 
" е 7% 12703 23034 1-24 3995 11849 
| 0987 3861 
6 — .1006 {3857 28 2274 So 1425 1826 
По ой Ds] 97 12194 90 1:22 pn 
3 1103 -3836 М8 0% Е ҒА "1781 
E -2823 E 
| 29 ud 13825 29 12852 E 135 MSS 
. 11736 
30.1179 23814 i 
dio got :3802 A aie oe den 51714 
XL 22451,2 72575 82 tas 2574 hl -1691 
-33 11293 .3778 E pd 1582 +1669 
180 11331 13765 184 — 12995 1-43 91041 
. 134 
35 1368 3152 85  .3023 one 
136 11106 13739 к 2305 1.35.4115 1604 
137 1443 -3725 3 eich 1.36 5 
E 14131 11582 
-38 -1480 3712 88 1.37 4147 1561 
139 11517 3697 3 ais 1.38 |4162 11539 
E 1.39 
"e. TH Y 48177 -1518 
СЕТ :3668 E] 1.40 ais; 
422 11628 3653 92 GM 11 23202 1475 
02 xs 4207 11476 
Аз — 664 3637 14. 
4 iU i 581 .3238 ls 4222 +1456 
. 94 13264 . -4236 1435 
1.44 
45 1736 -3605 95 2370 a ios 
46 1772 13589 ` у 34501 1.45 
47 1808 3572 26 mu 2516 1.46 ds 51204 
 ИШЫТУН 13555 а 25540 -2492 1.47 “495 1874 
49 11879 3538 m 36 22468 КЕНІ 45% 
м 99° 13389 2444 heap 308 91434 
-50 ло 3521 1.00 зиз 2420 а E 


(Continued)


Table A


x/σ    Area   Ordinate    x/σ    Area   Ordinate    x/σ    Area   Ordinate

1.50  .4332   .1295      2.00  .4772   .0540      2.50  .4938   .0175
1.51  .4345   .1276      2.01  .4778   .0529      2.51  .4940   .0171
1.52  .4357   .1257      2.02  .4783   .0519      2.52  .4941   .0167
1.53  .4370   .1238      2.03  .4788   .0508      2.53  .4943   .0163
1.54  .4382   .1219      2.04  .4793   .0498      2.54  .4945   .0158
1.55  .4394   .1200      2.05  .4798   .0488      2.55  .4946   .0154
1.56  .4406   .1182      2.06  .4803   .0478      2.56  .4948   .0151
1.57  .4418   .1163      2.07  .4808   .0468      2.57  .4949   .0147
1.58  .4429   .1145      2.08  .4812   .0459      2.58  .4951   .0143
1.59  .4441   .1127      2.09  .4817   .0449      2.59  .4952   .0139
1.60  .4452   .1109      2.10  .4821   .0440      2.60  .4953   .0136
1.61  .4463   .1092      2.11  .4826   .0431      2.61  .4955   .0132
1.62  .4474   .1074      2.12  .4830   .0422      2.62  .4956   .0129
1.63  .4484   .1057      2.13  .4834   .0413      2.63  .4957   .0126
1.64  .4495   .1040      2.14  .4838   .0404      2.64  .4959   .0122
1.65  .4505   .1023      2.15  .4842   .0395      2.65  .4960   .0119
1.66  .4515   .1006      2.16  .4846   .0387      2.66  .4961   .0116
1.67  .4525   .0989      2.17  .4850   .0379      2.67  .4962   .0113
1.68  .4535   .0973      2.18  .4854   .0371      2.68  .4963   .0110
1.69  .4545   .0957      2.19  .4857   .0363      2.69  .4964   .0107
1.70  .4554   .0940      2.20  .4861   .0355      2.70  .4965   .0104
1.71  .4564   .0925      2.21  .4864   .0347      2.71  .4966   .0101
1.72  .4573   .0909      2.22  .4868   .0339      2.72  .4967   .0099
1.73  .4582   .0893      2.23  .4871   .0332      2.73  .4968   .0096
1.74  .4591   .0878      2.24  .4875   .0325      2.74  .4969   .0093
1.75  .4599   .0863      2.25  .4878   .0317      2.75  .4970   .0091
1.76  .4608   .0848      2.26  .4881   .0310      2.76  .4971   .0088
1.77  .4616   .0833      2.27  .4884   .0303      2.77  .4972   .0086
1.78  .4625   .0818      2.28  .4887   .0297      2.78  .4973   .0084
1.79  .4633   .0804      2.29  .4890   .0290      2.79  .4974   .0081
1.80  .4641   .0790      2.30  .4893   .0283      2.80  .4974   .0079
1.81  .4649   .0775      2.31  .4896   .0277      2.81  .4975   .0077
1.82  .4656   .0761      2.32  .4898   .0270      2.82  .4976   .0075
1.83  .4664   .0748      2.33  .4901   .0264      2.83  .4977   .0073
1.84  .4671   .0734      2.34  .4904   .0258      2.84  .4977   .0071
1.85  .4678   .0721      2.35  .4906   .0252      2.85  .4978   .0069
1.86  .4686   .0707      2.36  .4909   .0246      2.86  .4979   .0067
1.87  .4693   .0694      2.37  .4911   .0241      2.87  .4979   .0065
1.88  .4699   .0681      2.38  .4913   .0235      2.88  .4980   .0063
1.89  .4706   .0669      2.39  .4916   .0229      2.89  .4981   .0061
1.90  .4713   .0656      2.40  .4918   .0224      2.90  .4981   .0060
1.91  .4719   .0644      2.41  .4920   .0219      2.91  .4982   .0058
1.92  .4726   .0632      2.42  .4922   .0213      2.92  .4982   .0056
1.93  .4732   .0620      2.43  .4925   .0208      2.93  .4983   .0055
1.94  .4738   .0608      2.44  .4927   .0203      2.94  .4984   .0053
1.95  .4744   .0596      2.45  .4929   .0198      2.95  .4984   .0051
1.96  .4750   .0584      2.46  .4931   .0194      2.96  .4985   .0050
1.97  .4756   .0573      2.47  .4932   .0189      2.97  .4985   .0048
1.98  .4761   .0562      2.48  .4934   .0184      2.98  .4986   .0047
1.99  .4767   .0551      2.49  .4936   .0180      2.99  .4986   .0046
2.00  .4772   .0540      2.50  .4938   .0175      3.00  .4987   .0044
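Entries of Table A can be checked, or values between tabled points obtained, from the equation of the unit normal curve. The Python sketch below uses only the standard library; the function names are hypothetical, and the area is measured from the mean to x/σ, as in the table.

```python
import math

def ordinate(z: float) -> float:
    """Height of the unit normal curve at z."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def area_from_mean(z: float) -> float:
    """Area under the unit normal curve between the mean and z."""
    return 0.5 * math.erf(z / math.sqrt(2))

# Print x/σ, area, and ordinate for a few commonly used points.
for z in (1.00, 1.96, 2.58):
    print(f"{z:.2f}  {area_from_mean(z):.4f}  {ordinate(z):.4f}")
```

Rounded to four places, these functions should agree with the tabled areas and ordinates, for example at x/σ = 1.96 and 2.58.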




Table B Critical values of t*


            Level of significance for one-tailed test

         .10      .05      .025     .01      .005     .0005

            Level of significance for two-tailed test

 df      .20      .10      .05      .02      .01      .001
  1     3.078    6.314   12.706   31.821   63.657  636.619
  2     1.886    2.920    4.303    6.965    9.925   31.598
  3     1.638    2.353    3.182    4.541    5.841   12.941
  4     1.533    2.132    2.776    3.747    4.604    8.610
  5     1.476    2.015    2.571    3.365    4.032    6.859
  6     1.440    1.943    2.447    3.143    3.707    5.959
  7     1.415    1.895    2.365    2.998    3.499    5.405
  8     1.397    1.860    2.306    2.896    3.355    5.041
  9     1.383    1.833    2.262    2.821    3.250    4.781
 10     1.372    1.812    2.228    2.764    3.169    4.587
 11     1.363    1.796    2.201    2.718    3.106    4.437
 12     1.356    1.782    2.179    2.681    3.055    4.318
 13     1.350    1.771    2.160    2.650    3.012    4.221
 14     1.345    1.761    2.145    2.624    2.977    4.140
 15     1.341    1.753    2.131    2.602    2.947    4.073
 16     1.337    1.746    2.120    2.583    2.921    4.015
 17     1.333    1.740    2.110    2.567    2.898    3.965
 18     1.330    1.734    2.101    2.552    2.878    3.922
 19     1.328    1.729    2.093    2.539    2.861    3.883
 20     1.325    1.725    2.086    2.528    2.845    3.850
 21     1.323    1.721    2.080    2.518    2.831    3.819
 22     1.321    1.717    2.074    2.508    2.819    3.792
 23     1.319    1.714    2.069    2.500    2.807    3.767
 24     1.318    1.711    2.064    2.492    2.797    3.745
 25     1.316    1.708    2.060    2.485    2.787    3.725
 26     1.315    1.706    2.056    2.479    2.779    3.707
 27     1.314    1.703    2.052    2.473    2.771    3.690
 28     1.313    1.701    2.048    2.467    2.763    3.674
 29     1.311    1.699    2.045    2.462    2.756    3.659
 30     1.310    1.697    2.042    2.457    2.750    3.646
 40     1.303    1.684    2.021    2.423    2.704    3.551
 60     1.296    1.671    2.000    2.390    2.660    3.460
120     1.289    1.658    1.980    2.358    2.617    3.373
  ∞     1.282    1.645    1.960    2.326    2.576    3.291

* Abridged from R. A. Fisher and F. Yates, Statistical tables for biological, agricultural,
and medical research, published by Oliver & Boyd, Ltd., Edinburgh, by permission of the
authors and publishers.




Table C Critical values of chi square*


Probability under H0 that χ² > chi square


5.41| 6.64 
7.82| 9.21 
9.84 11.34 
11.67 13.28) 
13.39 15.09 


e| .87 [1.13 | 1.64 | 2.20 | 3.07 | 3.83 10.64 12.59|15.03|16.81 
т| 1.24 [1.56 |2.17 | 2.83 | 3.82 | 4.67 12.02|14.07 |16.62|18.48 
в 1.65 |2.03 | 2.73 | 3.49 | 4.59 | 5.53 13.36/15.51 18.17 20.09 
o| 2.09 |2.53 |3.32 | 4.17 | 5.38 | 6.30) 8. 14.68/16.92/19.68/21.67 
10] 2.56 | 3.08 | 3.94 | 4.86 | 6.18 | 7.27 15.99/18,31/21.16|23.21 
11| 3.05 |3.61 | 4.58 | 5.58 | 6.09 |8 12.90/14.63/17.28/19.08/22.02/24.72 
12|3.57 |4.18 |5.23 | 6.30 | 7.81 | 9.03|11.34/14.01/15.81/18.5521.03 24.05 20.22 
13| 4.11 | 4.76 | 5.89 | 7.04 | 8.63 | 9.93/12.34/15.12/10.98/19.81 22.36 28.47 27.69 
14| 4.66 |537 | 6.87 | 7.79 | 9.47 |10.82/13.34/16.22|18.15]21.06|23.68/26.87|29.14 
15| 5.23 | 6.98 | 7.26 | 8.55 [10.31 |11.72/14.34|17.32/19.31/22.31/25.00 28.26/30.58 
16| 5.81 | 6.61 | 7.96 | 9.31 [11.15 15.3418.42|20.46|23.54|26.30/29.93/82.00 
17| 6.41 |7.26 | 8.67 |10.08 |12.00 10.34|19.51/21.62/24.77|27.59|31.00|33.41 
18|7.02 | 7.91 | 9.39 |10.86 |12.86 17.34/20.00/22.70/25.90/28.87 32.35|34.80 


8.57 [10.12 [11.65 |13.72 |15.35|18.34|21.69/23.90/27.20/30. 14 33.69 36. 10. 
10.85 |12.44 |14.58 |16.27|19.34|22.78|25.04|28.41 e 


21 17.18|20.34|23.86|26.17|29.62|32.67 6.34 8.034 

18.10]21.34|24.9427.3030.81 33.92 37.66 40.29 
19.02/22 .34/26.02]28.43|32.01|35.17 38.07 41.64. 
19.94 |23 .34/27.1029.5533.20 36.42 40.27 42.08 
20.87/24 .34/28.17/30.68/34.38/37.65 41.57 44.31 


21.79/25.34|29.25|31.80/35.50/88.88 42.80 45.64 
22.72]26.34 30.32 32.01|36.74 40. 11 44. 14 40.06 
23 .65|27.34|31.3934.03/37 .92|41.34 5.42 48.28. 
24.58|28.34|32.40/35. 14/39.09|42. 56 10. 09 49.59 
33.53|36.25/40. 26 43.77 47.9050. 89} 


* Abridged from Table IV of R. A. Fisher and F. Yates: Statistical tables for biological,
agricultural, and medical research, published by Oliver & Boyd, Ltd., Edinburgh, by
permission of the authors and publishers.


Table D Critical values of F*

5 per cent (roman type) and 1 per cent (bold-face type) points for the distribution of F.
Columns: degrees of freedom for greater mean square. Rows: degrees of freedom for lesser
mean square.

[Table body illegible in source.]

* Reprinted, by permission, from G. W. Snedecor and William G. Cochran, Statistical
methods, 6th ed., Iowa State University Press, Ames, Iowa, 1967.




Table E Transformation of r to z_r*


Ж РА ғ =, г =. г z, r 5, 
= ————__ алыл 
.000 .000 200 .203 400 424 600 .693 .800 1.099 
005 .005 205 .208 405.430 605 .701 805 1.113 
010 .010 210 .213 -410  .436 610 .709 810 1.127 
015 .015 215 .218 .415 (442 615 .717 .815 1.142 
020 .020 220 .224 420 .448 620 .725 .820 1.157 
1025 -454 625 733 825 1.172 
1030 -460 .630 — .741 :830 1.188 
1035 -466 -635 750 .835 1.204 
“040 .472 .640 — .758 .840 1.221 
.045 .478 2645 .767 .845 1.238 
.050 .485 650 — .775 .850 1.256 
.055 491 655 .784 .855 1.274 
.060 .497 660 .793 .860 1.293 
.065 504 665 .802 .865 1.313 
.070 .510 670 811 .870 1,333 
.075 1517 675 .820 875 1.354 
.080 .523 680 .829 :880 1.376 
.085 .530 685 838 .885 1.398 
.090 .536 690 — .848 .800 1.422 
.095 543 695 .858 .805 1.447 
.100 .549 1700 .867 .900 1.472 
.105 556 -705  .877 905 1.499 
110 563 710 .887 910 1.528 

.115 570 715 .897 .915 1.557 

ў .120 576 720 .908 .920 1.589 
.125 583 725 .918 925 1.623 
.130 :590 730 .929 930 71.658 
1135 .597 -735 940 935 1.697 
440 .604 .740 — .950 940 1.738 
145 611 -745 962 945 1.783 
150 .618 750 .973 950 1.832 
155 626 -755 .984 -955 1.886 
- 160 . 633 .760 .996 .960 1.946 
3402 са AES i 008 -965 2.014 
. А 020 .970 2.092 
-175 .655 775 1.033 975 2.185 
.180 662 780 1.045 -980 2.298 
.185 -670 -785 1.058 985 2.443 
.190 1678 2% 1.071 1990 2647 
+195 685 795 1.085 995 2.994 

* Reprinted ... Statistical methods, 2nd ed., ... Holt, Rinehart, and Winston ...
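The entries of Table E can be regenerated rather than looked up. The sketch below assumes the table tabulates Fisher's transformation z_r = ½ ln[(1 + r)/(1 − r)], which agrees with the printed values (for example r = .500 gives .549). The function names `r_to_z` and `z_to_r` are illustrative and do not come from the text.

```python
import math


def r_to_z(r: float) -> float:
    """Fisher's transformation of a correlation r to z_r (assumed form of Table E)."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    # 0.5 * ln((1 + r) / (1 - r)) is the inverse hyperbolic tangent of r.
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


def z_to_r(z: float) -> float:
    """Inverse transformation, from z_r back to r."""
    return math.tanh(z)


if __name__ == "__main__":
    # Print a few rows in the same r / z_r layout as the table.
    for r in (0.200, 0.500, 0.800, 0.995):
        print(f"{r:.3f}  {r_to_z(r):.3f}")
```

Because the transformation is odd, a negative r can be handled by transforming |r| and restoring the sign, which is presumably why the table lists only non-negative values.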




Table F Critical values of the correlation coefficient* 


Level of significance for one-tailed test

.05 .025 .01 .005

Level of significance for two-tailed test

[Column values and table body not recoverable from the scan.]


* Abridged from R. A. Fisher and F. Yates, Statistical tables for biological,
agricultural, and medical research, Oliver & Boyd, Ltd., Edinburgh, by 
permission of the authors and publishers. 


Table G Critical values of ρ, the Spearman rank correlation coefficient*


Significance level (one-tailed test)


 N     .05     .01
 4    1.000
 5     .900   1.000
 6     .829    .943
 7     .714    .893
 8     .643    .833
 9     .600    .783
10     .564    .746
12     .506    .712
14     .456    .645
16     .425    .601
18     ---     .564
20     ---     .534
22     ---     .508
24     ---     .485
26     ---     .465
28     ---     .448
30     ---     .432


* Adapted from E. G. Olds, Distributions of sums of squares of rank differences for small numbers of individuals, Annals of Mathematical Statistics, 9, 133-148, 1938; The 5% significance levels for sums of squares of rank differences and a correction, Annals of Mathematical Statistics, 20, 117-118, 1949; with the kind permission of the author and the publisher.


Table H Probabilities associated with values as large as observed values of S in the Kendall rank correlation coefficient*


               Values of N                              Values of N
  S     4       5       8          9           S      6        7        10
  0   .625    .592    .548       .540          1    .500     .500     .500
  2   .375    .408    .452       .460          3    .360     .386     .431
  4   .167    .242    .360       .381          5    .235     .281     .364
  6   .042    .117    .274       .306          7    .136     .191     .300
  8           .042    .199       .238          9    .068     .119     .242
 10           .0083   .138       .179         11    .028     .068     .190
 12                   .089       .130         13    .0083    .035     .146
 14                   .054       .090         15    .0014    .015     .108
 16                   .031       .060         17             .0054    .078
 18                   .016       .038         19             .0014    .054
 20                   .0071      .022         21             .00020   .036
 22                   .0028      .012         23                      .023
 24                   .00087     .0063        25                      .014
 26                   .00019     .0029        27                      .0083
 28                   .000025    .0012        29                      .0046
 30                              .00043       31                      .0023
 32                              .00012       33                      .0011
 34                              .000025      35                      .00047
 36                              .0000028     37                      .00018
                                              39                      .000058
                                              41                      .000015
                                              43                      .0000028
                                              45                      .00000028


* Adapted by permission from M. G. Kendall, Rank correlation methods, 2d ed., Charles Griffin & Co., Ltd., London, 1955.
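For small N the probabilities in Table H can be checked by brute force. The sketch below assumes that S is the number of concordant pairs minus the number of discordant pairs, and that under the null hypothesis every ordering of the N ranks is equally likely. The function names are illustrative, and complete enumeration is only practical for small N (roughly N ≤ 9).

```python
from fractions import Fraction
from itertools import permutations


def kendall_s(ranking):
    """S = (concordant pairs) - (discordant pairs) relative to the natural order."""
    s = 0
    n = len(ranking)
    for i in range(n):
        for j in range(i + 1, n):
            s += 1 if ranking[j] > ranking[i] else -1
    return s


def upper_tail_probability(n, s_observed):
    """Exact P(S >= s_observed) found by enumerating all n! equally likely orderings."""
    hits = 0
    total = 0
    for ranking in permutations(range(n)):
        total += 1
        if kendall_s(ranking) >= s_observed:
            hits += 1
    return Fraction(hits, total)


if __name__ == "__main__":
    for s in (0, 2, 4, 6):
        p = upper_tail_probability(4, s)
        print(f"N = 4, S = {s}: {p} = {float(p):.3f}")
```

For N = 4 the four fractions are 15/24, 9/24, 4/24 and 1/24, which round to the first column of the table.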


Table I Critical and quasi-critical lower-tail values of W+ (and their probability levels) for Wilcoxon's signed-rank test*


 N     α = .05       α = .025      α = .01       α = .005


 5     0 .0313
       1 .0625
 6     2 .0469       0 .0156
       3 .0781       1 .0313
 7     3 .0391       2 .0234       0 .0078
       4 .0547       3 .0391       1 .0156
 8     5 .0391       3 .0195       1 .0078       0 .0039
       6 .0547       4 .0273       2 .0117       1 .0078
 9     8 .0488       5 .0195       3 .0098       1 .0039
       9 .0645       6 .0273       4 .0137       2 .0059
10    10 .0420       8 .0244       5 .0098       3 .0049
      11 .0527       9 .0322       6 .0137       4 .0068
11    13 .0415      10 .0210       7 .0093       5 .0049
      14 .0508      11 .0269       8 .0122       6 .0068
12    17 .0461      13 .0212       9 .0081       7 .0046
      18 .0549      14 .0261      10 .0105       8 .0061
13    21 .0471      17 .0239      12 .0085       9 .0040
      22 .0549      18 .0287      13 .0107      10 .0052
14    25 .0453      21 .0247      15 .0083      12 .0043
      26 .0520      22 .0290      16 .0101      13 .0054
15    30 .0473      25 .0240      19 .0090      15 .0042
      31 .0535      26 .0277      20 .0108      16 .0051
16    35 .0467      29 .0222      23 .0091      19 .0046
      36 .0523      30 .0253      24 .0107      20 .0055
17    41 .0492      34 .0224      27 .0087      23 .0047
      42 .0544      35 .0253      28 .0101      ---


Table I (Continued) 


 N      α = .05        α = .025       α = .01        α = .005

21     67 .0479       58 .0230       49 .0097       ---
       68 .0516       59 .0251       50 .0108       ---
22     75 .0492       65 .0231       55 .0095       ---
       76 .0527       66 .0250       56 .0104       ---
23     83 .0490       73 .0242       62 .0098       ---
       84 .0523       74 .0261       63 .0107       ---
24     91 .0475       81 .0245       69 .0097       ---
       92 .0505       82 .0263       70 .0106       ---
25    100 .0479       89 .0241       76 .0094       ---
      101 .0507       90 .0258       77 .0101       ---
26    110 .0497       98 .0247       84 .0095       ---
      111 .0524       99 .0263       85 .0102       ---
27    119 .0477      107 .0246       92 .0093       ---
      120 .0502      108 .0260       93 .0100       ---
28    130 .0496      116 .0239      101 .0096       ---
      131 .0521      117 .0252      102 .0102       ---
29    140 .0482      126 .0240      110 .0095       ---
      141 .0504      127 .0253      111 .0101       ---
30    151 .0481      137 .0249      120 .0098       ---
      152 .0502      138 .0261      121 .0104       ---
31    163 .0491      147 .0239      130 .0099       ---
      164 .0512      148 .0251      131 .0105       ---
32    175 .0492      159 .0249      140 .0097       ---
      176 .0512      160 .0260      141 .0103       ---
33    187 .0485      170 .0242      151 .0099       ---
      188 .0503      171 .0253      152 .0104       ---
34    200 .0488      182 .0242      162 .0098       ---
      201 .0506      183 .0252      163 .0103       ---
35    213 .0484      195 .0247      173 .0096       ---
      214 .0501      196 .0257      174 .0100       ---
36    227 .0489      208 .0248      185 .0096       ---
      228 .0505      209 .0258      186 .0100       ---
37    241 .0487      221 .0245      198 .0099       ---
      242 .0503      222 .0254      199 .0103       ---
38    256 .0493      235 .0247      211 .0099       ---
      257 .0509      236 .0256      212 .0104       ---
39    271 .0493      249 .0246      224 .0099       ---
      272 .0507      250 .0254      225 .0103       ---
40    286 .0486      264 .0249      238 .0100       ---
      287 .0500      265 .0257      239 .0104       ---
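The probability levels printed beside each critical value in Table I can be reproduced exactly. The sketch below assumes that W+ is the sum of the ranks carrying a positive sign and that, under the null hypothesis, each of the ranks 1 to N is positive with probability 1/2 independently of the others. The function name `signed_rank_lower_tail` is illustrative.

```python
from fractions import Fraction


def signed_rank_lower_tail(n, w):
    """Exact P(W+ <= w) for n untied, non-zero differences under the null hypothesis."""
    max_sum = n * (n + 1) // 2
    # counts[t] = number of subsets of the ranks 1..n whose sum equals t.
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        # Walk downward so that each rank is used at most once per subset.
        for total in range(max_sum, rank - 1, -1):
            counts[total] += counts[total - rank]
    return Fraction(sum(counts[: w + 1]), 2 ** n)


if __name__ == "__main__":
    for n, w in ((5, 0), (5, 1), (7, 3), (10, 10)):
        p = signed_rank_lower_tail(n, w)
        print(f"N = {n}, W+ <= {w}: {p} = {float(p):.5f}")
```

For N = 5 the values are 1/32 and 2/32, which the table prints as .0313 and .0625. The pair of rows for each N brackets the nominal α: the first probability lies just below it and the second just above.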


Table K Critical lower-tail values of R1 for rank test for two independent samples (N1 ≤ N2)*


N,=2 Е : 
270 R 5 0.010 0.025 0.05 0.10 ой, м, 
з 0.05 0.10 ой, 0.001 0.005 0. 
, 0.00! 0.005 0.010 0.025 0. 

: - 0 2 
ү n 3 я s 
? 3 4 16 5 
3 4 в 6 
; = 3 4 7 
са” 3 3 5 в 
1 п 3 Н 5 5 
х 3 3 4 6 10 
3 4 6 п 
* с трав 
1 15 $ 2 3 7 ив 
1 16 : : 3 1 M 13 
1 7 $ 2 8 Я DEC 
108 3 A е a а 16 
| om 2 5 6 9 4 17 
тан яв 500709 gis 

1 P ; 4 5 7 

1 2 0% к 1 2 1 

1 2 B йб x 5 E 

1 2 2 $ 4 g a 

1 2/5 t $ 8 қ 

1 20% 3 4 А 3 

= = - 1 200 = 3 4 6 ; 


N-3 E Мт4 
№ 0.001 0.005 0.010 0.025 0.05 оло ой, 0.001 0.005 0.010 0. 25 


6 7 2 
E 6 7 2+ - 10 n 13 36 4 
6 7 в 2 - 10 n n n % 5 
- 7 8 9 30 10 n 12 13 15 и 6 
6 7 8 10 з 10 n 13 n 16 в т 
y - 6 в 9 и 36 и 12 n 15 17 52 8 
6 7 8 ю и 39 E n" 13 nu 16 19 55 9 
6 1 9 и о g 10 12 13 15 п 20 60 10 
6 il 9 n 13 45 10 12 14 16 18 21 Ы 11 
1 8 10 n no зз 10 13 15 17 юй 22 68 12 
7 8 10 2 5 я n 13 15 18 20 5 72 13 
7 8 T з s n n 16 19 21 25 16 14 
8 9 u 13 16 57 n 15 п 20 22 26 80 15 
8 9 12 KW и @ 12 15 7 21 з oF в 16 
8 10 12 5 в е 12 16 18 21 25 æ вв 17 
8 13 5 9 6 13 16 19 22 2% 30 — 9» ІВ 
tj 13 6 æ% о 13 17 19 23 7 з % 19 
9 n п om 12 13 18 20 2% 2 32 100 20 
9 14 R. a 15 14 18 21 25 9 з а 
15 в 28 зв м 19 21 26 30 35 108 22 
15 9 23 81 м 19 22 21 ЕЛ 36 112 23 
16 D м м 5 20 23 2 32 за 16 24 
16 20 25 87 15 20 23 28 33 38 120 25 


* This table is reproduced, with changes in notation, from Table I in L. R. Verdooren, Extended tables of critical values for Wilcoxon's test statistic, Biometrika, 50, 177-186, 1963, with permission of the author and editor.

† A blank space indicates that no value is significant at that level. The upper critical value is 2R̄1 - R1. ... significant at the probability level quoted at the head of column ... twice the mean ...
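The table lists only lower-tail critical values, together with a column for twice the mean rank sum, from which the footnote obtains the upper critical value by subtraction. The sketch below assumes that R1 is the sum of the ranks of the smaller sample of size N1 when all N1 + N2 observations are ranked together, so that twice its mean is N1(N1 + N2 + 1). The function names are illustrative.

```python
def twice_mean_rank_sum(n1, n2):
    """2 * mean of R1, the rank sum of the sample of size n1 (assumed: ranks 1..n1+n2)."""
    return n1 * (n1 + n2 + 1)


def upper_critical_value(lower_critical, n1, n2):
    """Upper-tail critical value that mirrors a tabled lower-tail critical value."""
    return twice_mean_rank_sum(n1, n2) - lower_critical


if __name__ == "__main__":
    # N1 = 19, N2 = 20: the table prints 760 in the column for twice the mean.
    print(twice_mean_rank_sum(19, 20))
    # The tabled lower critical value 272 at the 0.001 level gives its upper counterpart.
    print(upper_critical_value(272, 19, 20))
```

The formula reproduces legible entries of that column elsewhere in the table, for example 1275 for N1 = N2 = 25.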


Table K (Continued)


N1 = 5                                          N1 = 6
N2 0.001 0.005 0.010 0.025 0.05 0.10 2R̄1 0.001 0.005 0.010 0.025 0.05 0.10 2R̄1
5 15 16 и 9 2 5 
6 lo In 18 ж z 60 = 23 2 26 
7 = 16 18 20 21 з бу 21 2 25 27 5 2 
è 15 17 19 21 з 25 70 22 25 27 29 a з 
9 16 18 22 з 2 15 23 26 28 31 33 3% 
P 16 19 23 5 з 8 En 27 29 32 5 38 102 
17 20 т п w 8 25 

15 И 21 26 æ 3 9 25 30 2 E E 5 We 
1з 18 22 27 0 зз 95 26 з1 33 37 3 u 120 
и 18 22 28 31 35 100 27 32 E 38 42 в 1% 
18 s 23 29 з 35 15 28 33 36 30 нов 132 
6 2 2 27 30 з — 3 10 29 з 37 42 
17 20 25 28 32 з 30 15 30 36 39 43 a = ie 
18 2 26 2 33 йз 42 12 E 37 40 45 39 55 150 
12 22 Т 30 3 з 8 15 32 38 E 46 51 5т 15 
* 2 28 al 35 ю 45 10 33 39 43 48 8009 162 
2 23 29 32 37 7 135 33 E 50 55 
22 23 29 33 38 з 48 щш 3 E) 45 51 57 ө In 
23 2+ 30 39 50 145 35 43 47 53 58 6 180 
24 25 з 35 40 4 ы 19 36 48 5% ө 67 186 
25 5 32 36 42 и 53 15 37 a5 50 56 62 — 69 192 

N1 = 7                                          N1 = 8
N2 0.001 0.005 0.010 0.025 0.05 0.10 2R̄1 0.001 0.005 0.010 0.025 0.05 0.10 2R̄1 N2
т m5 32 м 36 3 и 15 
8 з з 35 38 E и m 40 5 51 55 
9 aM 35 37 40 з 4 19 a 45 s > 5 58 
10 3 37 39 42 $ W 1% a2 и 49 53 55 60 
п 38 40 u 47 ЕЛ 133 a 39 51 55 59 63 
12 35 40 42 46 9 5+ 140 45 5 53 58 62 6 
13 36 a 48 52 56 147 т 53 56 60 ө 69 
14 3 m 45 50 5$ 59 15 58 62 67 72 
15 38 и 52 56 61 161 50 56 60 65 69 15 
16 39 46 49 54 58 [2] 168 51 58 62 Ш 12 78 
171004 7 EI 56 6 6 15 53 60 64 т 15 81 
8 42 49 52 58 6 6 182 62 66 72 т 84 
19 43 50 5% 60 65 7 189 56 oF 68 n 80 87 
20 LU 52 56 62 67 74 1% 57 66 70 тї вз 90 
21 46 53 58 oF ө 16 203 59 в 72 19 в оз 
22 47 55 59 66 32 19 20 60 70 24 в! 88 95 
23 an 57 6l 68 74 а 217 62 71 76 84 90 98 
24 9 58 63 70 6 м 2% oF 3 78 85 93 101 
25 50 60 oF 72 78 86 231 65 15 al 89 96 104 

N1 = 9                                          N1 = 10
N2 0.001 0.005 0.010 0.025 0.05 0.10 2R̄1 0.001 0.005 0.010 0.025 0.05 0.10 2R̄1 N2
9 52 56 59 62 66 10 mi M 

65 т 74 78 


58 ы 


Table K (Continued)


0.05 0.10 2R̄1 0.001 0.005 0.010 0.025 0.05 0.10 2R̄1 N2
9 % 100 1% 253 
oF 99 10% 110 264 105 109 15 120 127 12 
97 юз 108 15 275 109 из 19 125 131 13 
100 106 — 2 18 26 п? 116 123 129 136 14 
103 no 06 13 297 ns 120 127 B3 м 15 
107 n3 120 127 эш 119 124 131 138 — M5 16 
110 1]? 13 131 309 122 127 135 142 150 17 
из 21 127 135 330 125 ізі 139 46 155 18 
16 ш 131 139 зи 129 134 143 150 159 19 
19 128 135 м 35 132 138 17 155 16 20 
123 131 139 H8 зз 136 142 151 159 169 21 
126 135 — M3 152 34 139 145 155 163 173 22 
129 39 мт 156 385 142 149 159 168 178 23 
132 142 151 161 396 146 153 163 172 183 24 
136 ие 155 165 407 19 156 167 176 187 25 


N1 = 14

0.001 0.005 0.010 0.025 0.05 0.10 2R̄1 N2
130 35 42 мә as 
134 17154364 137 nu 152 160 166 175 406 14 
138 MS 152 19 зт m 151 156 16 171 19 420 15 
142 150 156 165 зю 144 155 161 10 16 185 — 43 16 
146 154 16 10 — 30 148 159 165 174 182 190 из 17 
150 158 166 115 йб 151 163 170 1 187 1% 462 18 
154 вз т 180 429 155 168 174 183 19 x2 4% 19 
158 167 175 185 w 159 172 178 188 97 207 40% 20 
162 11 180 190 455 162 176 183 93 202 23 s% 21 
166 176 185 195 48 166 180 187 198 E 218 518 22 
170 180 189 20 даі 169 183 192 23 22 2% 532 23 
174 185 194 205 49 173 188 196 97 28 29 56 24 
178 19 19 2») ë 177 


192 200 212 223 235 560 25 


N1 = 15                                         N1 = 16
N2 0.001 0.005 0.010 0.025 0.05 0.10 2R̄1 0.001 0.005 0.010 0.025 0.05 0.10 2R̄1 N2
2 0-1" 6 18 эз 0 № 
16 — 163 175 181 190 197 206 480 184 196 202 
о "9 . 195/195 м 22 4 в ж жт 21 ES 20 — som 1 
ш im m m m m m ыы 722, a чт 
14 24 5% 16 2 2 ў 
20 179 193 200 210 220 230 540 201 215 223 234 ni = dh 25 
21 зз эю ж б о ш ы Б % VH. our d 1 
ERU о со л э № 55 ОБ. n5. 35 x O 
оо ж 1% Ve pou. 24 5% м ж © м €4 22 
24 195 2 219 231 242 254 600 218 235 24 256 274 640 23 
Ш5 09 26 24 м з м № Cep LONE 726 34 
N1 = 17                                         N1 = 18
0.010 0.025 0.05 0.10 2R̄1 0.001 0.005 0.010 0.025 0.05 0.10 2R̄1 N2
20 20 29 x9 5% 
25 26 — 255 26 62 237. "253 
241 252 % 213 69 242 238 = i 20 221 56 18 
26 258 268 280 66 м, om сш Ш Р 730 
252 24 28 з бз 0° в т a0 б. Eus. Sanh ат 
28 20 — 281 291 680 Bii CCS CE md m 
23 26 Ж 30 697 LE NE NE qc Es 
20 22 295 307 m 2571 Ж% ы № Аң 4 
215 288 — 300 34 — 73 273 292 SS, 74 2 


301 316 328 343 192 25 


Table K (Continued)

N1 = 19                                         N1 = 20
N2 0.001 0.005 0.010 0.025 0.05 0.10 2R̄1 0.001 0.005 0.010 0.025 0.05 0.10 2R̄1
19 267 283 291 33 зз 35 ти 
20 272 289 297 39 39 333 160 298 315 324 337 за зы 
21 27 295 303 36 зз — 38 79 E 322 331 34 356 310 
22 283 301 310 323 335 39 798 309 328 337 350 365 378 
23 288 307 316 эю 32 357 вт 315 335 зи 359 371 386 
24 29 313 323 337 50 361 836 321 3n 351 366 379 394 
25 29 319 329 зи 357 32 855 327 318 358 33 387 403 
N1 = 21                                         N1 = 22
N2 0.001 0.005 0.010 0.025 0.05 0.10 2R̄1 0.001 0.005 0.010 0.025 0.05 0.10
21 31 349 359 33 385 39 90 
22 зт 356 366 381 393 408 9% 365 386 396 an 424 439 
23 33 363 373 388 401 47 95 372 393 403 49 42 ив 
24 39 370 381 306 — 410 425 966 379 400 407 M 4517 
25 356 37 388 404 418 4% 987 385 408 419 435 450 — 467 


№ =23 
№ 0.001 0.005 0.010 0.025 0.05 0.10 2 
M 
415 


2з 402 an аз 451 


24 409 431 ыз 459 
25 416 439 451 468 
№, =25 


№ 0,001 0.005 0.010 0.025 


25 480 505 517 536 


465 481 
474 491 
483 500 
0.05 0.10 
552 50 


1081 
1104 
1127 


2R, 


1275 


Log 
448 


N,-24 vi 
/0.000 0.005 0.010 0.025 0.05 0.10 2R, 


492 
501 


176 24 
1200 25 


507 525 
67) 539 


Table L Critical values of the Studentized range statistic*


k = number of means or steps between ordered means


df for s²   1 - α      2       3       4       5       7
   1        .95      18.0    27.0    32.8    37.1    43.1
            .99      90.0    135     164     186     216
   2        .95      6.09    8.3     9.8     10.9    12.4
            .99      14.0    19.0    22.3    24.7    28.2
   3        .95      4.50    5.91    6.82    7.50    8.48
            .99      8.26    10.6    12.2    13.3    15.0
   4        .95      3.93    5.04    5.76    6.29    7.05
            .99      6.51    8.12    9.17    9.96    11.1
   5        .95      3.64    4.60    5.22    5.67    6.33
            .99      5.70    6.97    7.80    8.42    9.32
   6        .95      3.46    4.34    4.90    5.31    5.89
            .99      5.24    6.33    7.03    7.56    8.32
   7        .95      3.34    4.16    4.69    5.06    5.61
            .99      4.95    5.92    6.54    7.01    7.68
   8        .95      3.26    4.04    4.53    4.89    5.40
            .99      4.74    5.63    6.20    6.63    7.24
   9        .95      3.20    3.95    4.42    4.76    5.24
            .99      4.60    5.43    5.96    6.35    6.91
  10        .95      3.15    3.88    4.33    4.65    5.12
            .99      4.48    5.27    5.77    6.14    6.67
  11        .95      3.11    3.82    4.26    4.57    5.03
            .99      4.39    5.14    5.62    5.97    6.48
  12        .95      3.08    3.77    4.20    4.51    4.95
            .99      4.32    5.04    5.50    5.84    6.32
  13        .95      3.06    3.73    4.15    4.45    4.88
            .99      4.26    4.96    5.40    5.73    6.19
  14        .95      3.03    3.70    4.11    4.41    4.83
            .99      4.21    4.89    5.32    5.63    6.08
  16        .95      3.00    3.65    4.05    4.33    4.74
            .99      4.13    4.78    5.19    5.49    5.92
  18        .95      2.97    3.61    4.00    4.28    4.67
            .99      4.07    4.70    5.09    5.38    5.79
  20        .95      2.95    3.58    3.96    4.23    4.62
            .99      4.02    4.64    5.02    5.29    5.69
  24        .95      2.92    3.53    3.90    4.17    4.54
            .99      3.96    4.54    4.91    5.17    5.54
  30        .95      2.89    3.49    3.84    4.10    4.46
            .99      3.89    4.45    4.80    5.05    5.40
  40        .95      2.86    3.44    3.79    4.04    4.39
            .99      3.82    4.37    4.70    4.93    5.27
  60        .95      2.83    3.40    3.74    3.98    4.31
            .99      3.76    4.28    4.60    4.82    5.13
 120        .95      2.80    3.36    3.69    3.92    4.24
            .99      3.70    4.20    4.50    4.71    5.01
   ∞        .95      2.77    3.31    3.63    3.86    4.17
            .99      3.64    4.12    4.40    4.60    4.88


* This table is abridged from Table 1... in The probability integrals of the range and of the Studentized range, prepared by H. Leon Harter, Donald S. Clemm, and Eugene H. Guthrie. These tables are published in WADC tech. Rep. 58-484, vol. 2, 1959, Wright Air Development Center, and are reproduced with the kind permission of the authors.


Table M Squares and square roots of numbers from 1 to 1,000*


  N       N²        √N       √10N        N       N²        √N       √10N
  1        1      1.0000    3.1623      51     26 01     7.1414   22.5832
  2        4      1.4142    4.4721      52     27 04     7.2111   22.8035
  3        9      1.7321    5.4772      53     28 09     7.2801   23.0217
  4       16      2.0000    6.3246      54     29 16     7.3485   23.2379
  5       25      2.2361    7.0711      55     30 25     7.4162   23.4521
  6       36      2.4495    7.7460      56     31 36     7.4833   23.6643
  7       49      2.6458    8.3666      57     32 49     7.5498   23.8747
  8       64      2.8284    8.9443      58     33 64     7.6158   24.0832
  9       81      3.0000    9.4868      59     34 81     7.6811   24.2899
 10     1 00      3.1623   10.0000      60     36 00     7.7460   24.4949
 11     1 21      3.3166   10.4881      61     37 21     7.8102   24.6982
 12     1 44      3.4641   10.9545      62     38 44     7.8740   24.8998
 13     1 69      3.6056   11.4018      63     39 69     7.9373   25.0998
 14     1 96      3.7417   11.8322      64     40 96     8.0000   25.2982
 15     2 25      3.8730   12.2474      65     42 25     8.0623   25.4951
 16     2 56      4.0000   12.6491      66     43 56     8.1240   25.6905
 17     2 89      4.1231   13.0384      67     44 89     8.1854   25.8844
 18     3 24      4.2426   13.4164      68     46 24     8.2462   26.0768
 19     3 61      4.3589   13.7840      69     47 61     8.3066   26.2679
 20     4 00      4.4721   14.1421      70     49 00     8.3666   26.4575
 21     4 41      4.5826   14.4914      71     50 41     8.4261   26.6458
 22     4 84      4.6904   14.8324      72     51 84     8.4853   26.8328
 23     5 29      4.7958   15.1658      73     53 29     8.5440   27.0185
 24     5 76      4.8990   15.4919      74     54 76     8.6023   27.2029
 25     6 25      5.0000   15.8114      75     56 25     8.6603   27.3861
 26     6 76      5.0990   16.1245      76     57 76     8.7178   27.5681
 27     7 29      5.1962   16.4317      77     59 29     8.7750   27.7489
 28     7 84      5.2915   16.7332      78     60 84     8.8318   27.9285
 29     8 41      5.3852   17.0294      79     62 41     8.8882   28.1069
 30     9 00      5.4772   17.3205      80     64 00     8.9443   28.2843
 31     9 61      5.5678   17.6068      81     65 61     9.0000   28.4605
 32    10 24      5.6569   17.8885      82     67 24     9.0554   28.6356
 33    10 89      5.7446   18.1659      83     68 89     9.1104   28.8097
 34    11 56      5.8310   18.4391      84     70 56     9.1652   28.9828
 35    12 25      5.9161   18.7083      85     72 25     9.2195   29.1548
 36    12 96      6.0000   18.9737      86     73 96     9.2736   29.3258
 37    13 69      6.0828   19.2354      87     75 69     9.3274   29.4958
 38    14 44      6.1644   19.4936      88     77 44     9.3808   29.6648
 39    15 21      6.2450   19.7484      89     79 21     9.4340   29.8329
 40    16 00      6.3246   20.0000      90     81 00     9.4868   30.0000
 41    16 81      6.4031   20.2485      91     82 81     9.5394   30.1662
 42    17 64      6.4807   20.4939      92     84 64     9.5917   30.3315
 43    18 49      6.5574   20.7364      93     86 49     9.6437   30.4959
 44    19 36      6.6332   20.9762      94     88 36     9.6954   30.6594
 45    20 25      6.7082   21.2132      95     90 25     9.7468   30.8221
 46    21 16      6.7823   21.4476      96     92 16     9.7980   30.9839
 47    22 09      6.8557   21.6795      97     94 09     9.8489   31.1448
 48    23 04      6.9282   21.9089      98     96 04     9.8995   31.3050
 49    24 01      7.0000   22.1359      99     98 01     9.9499   31.4643
 50    25 00      7.0711   22.3607     100   1 00 00    10.0000   31.6228


* With permission of the author and publisher, abridged from J. P. Guilford, Fundamental statistics in psychology and education, 4th ed., 1965, McGraw-Hill Book Company, New York.
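Any row of Table M can be recomputed directly, which is a convenient check where a printed entry is hard to read. The sketch below assumes the four columns are N, N², √N and √(10N), with the roots rounded to four decimal places. The function names are illustrative.

```python
import math


def table_m_row(n):
    """Return (N, N squared, sqrt(N), sqrt(10 N)), roots rounded to four decimals."""
    return n, n * n, round(math.sqrt(n), 4), round(math.sqrt(10 * n), 4)


def format_row(n):
    """Format one row in the column order of Table M."""
    number, square, root, root_of_ten_n = table_m_row(n)
    return f"{number:>5} {square:>9} {root:>9.4f} {root_of_ten_n:>9.4f}"


if __name__ == "__main__":
    for n in (2, 61, 62, 100):
        print(format_row(n))
```

The √10N column extends the range of the table: the square root of a number such as 620 is read from the row for N = 62.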


Table M (Continued)

N N² √N √10N N N² √N √10N
101 10201 10.0499 31.7805 151 22801 122882 388587 
102 10404 10.0995 31.9374 152 23104 123288 38.9872 
103 106 09 10.1489 32.0936 153 23409 12.3693 391152 
104 10816 10.1980 32.2490 154 23716 124097 392428 
105 14025 102470 324037 155 24025 12.4499 39.3700 
106 11236 10.2056 325516 156 24336 12.4900 39.4968 
107 11449 103441 32.7109 157 24649 12.5300 39.6232 
108 116 64 10.3923 32.8634 158 24964 12.5698 39,7492 
109 11881 104403 330151 159 25281 12.6095 39.8748 
M0 12100 104881 33.1662 160 2 56 00 12.6491 40.0000 
ПІ 12321 10.5357 333167 161 25921 126886 40.1948 
112 12544 105830 33.4664 162 26244 127979 40,2492 
13 12769 10.6301 33.6155 163 26569 12.7671 40.3733 
114, 139 96 10.6771 33.7639 164 26896 12.8062 404969 
115 13225 10.7288 · 33.9116 165 2 72 25 12.8452 40,6202 
116 13456 10.7703 34.0588 166 27556 12.8841 40,7431 
117 136 89 10.8167 34.2053 167 2 78 89 12.9228 40.8656 
118 13924 10.8628 343511 168 28224 129615 40.9878 
119 14161 10.9087 344964 169 28561 13.0000 411096 
120 14400 109545 346410 170 28900 130984 412311 
121 14641 11.0000 34.7851 171 29241 13.0767 413501 
122 148 84 110454 349285 12 29584 13149 414799 
123 15129 11.0905 35.0714 173 29929 131529 415933 
124 153 76 11.1355 35.2136 174 3 02 76 18.1909 41.7133 
125 156 25 11.1803 35.3553 175 30625 13.2288 4158330 
126 158 76 11.2250 35.4965 116 80976 182665 41 9524 
127 16129 11.2694 35,6371 1117 81829 18.3041 420714 
128 16384 113197 35.7771 l8 31684 133417 421900 
129 16641 113578 359166 119 32041 133791 423084 
130 16900 114018 36.0555 180 32400 134164 424264 
131 17161 11.4455 36.1939 181 32761 134536 42544 
132 17424 11.4891 363318 182 33124 134907 426615 
133 17689 115326 36.4692 155 88489 145077 49786 
134 17956 11.5758 36.6060 I94 S 83.56 — 1856 43.8052 
u : E " 11.6190 36.7423 155 34225 13.6015 48.0116 
1 1L6019 36.8782 186 3 45 9g А 

137 18769 117047 370135 187 34969 ine rt 
138 19044 11.7473 37,1484 188 35344 13.7113 43.3590 
139 19321 117898 372897 189 35721 181477 43494 
140 19600 11.8322 37.4166 180 o 6100). 18540 ^ 426800 
14] 19881 11.8743 37.5500 191 246481 18.8208 487085 
142 20164 119164 37.6829 192. 86864 188584 128178 
143 20449 119583 37.8153 18% :8 7249. 13:994 459918 
144 20736 12.0000 379473 T4 8/7085 15994 44044 
145 21025 120416 кеш: 195 88095 18.9642 441658 
146 21316 12.0830 ; 190 38416 140 

147 216 00 12.1244 38.3406 197 388 09 Moses picid 
148 21904 12.1655 384708: 38 89204 140712 444972 
E- Do. du 44.6094 
150 22500 122474 38.7298 200 4 


Table M (Continued)


  N       N²        √N       √10N        N       N²        √N       √10N
201    4 04 01    14.1774   44.8330     251    6 30 01    15.8430    ---
202    4 08 04    14.2127   44.9444     252    6 35 04    15.8745    ---
203    4 12 09    14.2478   45.0555     253    6 40 09    15.9060    ---
204    4 16 16    14.2829   45.1664     254    6 45 16    15.9374    ---
205    4 20 25    14.3178   45.2769     255    6 50 25    15.9687    ---
206    4 24 36    14.3527   45.3872     256    6 55 36    16.0000    ---
207    4 28 49    14.3875   45.4973     257    6 60 49    16.0312    ---
208    4 32 64    14.4222   45.6070     258    6 65 64    16.0624    ---
209    4 36 81    14.4568   45.7165     259    6 70 81    16.0935    ---
210    4 41 00    14.4914   45.8258     260    6 76 00    16.1245    ---
211    4 45 21    14.5258   45.9347     261    6 81 21    16.1555    ---
212    4 49 44    14.5602   46.0435     262    6 86 44    16.1864    ---
213    4 53 69    14.5945   46.1519     263    6 91 69    16.2173    ---
214    4 57 96    14.6287   46.2601     264    6 96 96    16.2481    ---
215    4 62 25    14.6629   46.3681     265    7 02 25    16.2788    ---
216    4 66 56    14.6969   46.4758     266    7 07 56    16.3095    ---
217    4 70 89    14.7309   46.5833     267    7 12 89    16.3401    ---
218    4 75 24    14.7648   46.6905     268    7 18 24    16.3707    ---
219    4 79 61    14.7986   46.7974     269    7 23 61    16.4012    ---
220    4 84 00    14.8324   46.9042     270    7 29 00    16.4317    ---
221    4 88 41    14.8661   47.0106     271    7 34 41    16.4621    ---
222    4 92 84    14.8997   47.1169     272    7 39 84    16.4924    ---
223    4 97 29    14.9332   47.2229     273    7 45 29    16.5227    ---
224    5 01 76    14.9666   47.3286     274    7 50 76    16.5529    ---
225    5 06 25    15.0000   47.4342     275    7 56 25    16.5831    ---
226    5 10 76    15.0333   47.5395     276    7 61 76    16.6132    ---
227    5 15 29    15.0665   47.6445     277    7 67 29    16.6433    ---
228    5 19 84    15.0997   47.7493     278    7 72 84    16.6733    ---
229    5 24 41    15.1327   47.8539     279    7 78 41    16.7033    ---
230    5 29 00    15.1658   47.9583     280    7 84 00    16.7332    ---
231    5 33 61    15.1987   48.0625     281    7 89 61    16.7631    ---
232    5 38 24    15.2315   48.1664     282    7 95 24    16.7929    ---
233    5 42 89    15.2643   48.2701     283    8 00 89    16.8226    ---
234    5 47 56    15.2971   48.3735     284    8 06 56    16.8523    ---
235    5 52 25    15.3297   48.4768     285    8 12 25    16.8819    ---
236    5 56 96    15.3623   48.5798     286    8 17 96    16.9115    ---
237    5 61 69    15.3948   48.6826     287    8 23 69    16.9411    ---
238    5 66 44    15.4272   48.7852     288    8 29 44    16.9706    ---
239    5 71 21    15.4596   48.8876     289    8 35 21    17.0000    ---
240    5 76 00    15.4919   48.9898     290    8 41 00    17.0294    ---
241    5 80 81    15.5242   49.0918     291    8 46 81    17.0587    ---
242    5 85 64    15.5563   49.1935     292    8 52 64    17.0880    ---
243    5 90 49    15.5885   49.2950     293    8 58 49    17.1172    ---
244    5 95 36    15.6205   49.3964     294    8 64 36    17.1464    ---
245    6 00 25    15.6525   49.4975     295    8 70 25    17.1756    ---
246    6 05 16    15.6844   49.5984     296    8 76 16    17.2047    ---
247    6 10 09    15.7162   49.6991     297    8 82 09    17.2337    ---
248    6 15 04    15.7480   49.7996     298    8 88 04    17.2627    ---
249    6 20 01    15.7797   49.8999     299    8 94 01    17.2916    ---
250    6 25 00    15.8114   50.0000     300    9 00 00    17.3205    ---


Table M (Continued)

N N² √N √10N N N² √N √10N

301 90601 17.3494 54.8635 351 123201 18.7350 59.2453 
302 9 12 04 17.3781 54.9545 352 12 39 04 18.7617 59.3296 
303 9 18 09 17.4069 55.0454 353 12 46 09 18.7883 59.4138 
304 9 24 16 17.4356 55.1362 354 12 53 16 18.8149 59.4979 
305 9 30 25 17.4642 55.2268 355 12 60 25 18.8414 59.5819 
306 9 36 36 17.4929 55.3173 356 12 67 36 18.8680 59.6657 
307 9 42 49 17.5214 55.4076 357 12 74 49 18.8944 59.7495 
308 9 48 64 17.5499 55.4977 358 12 81 64 18.9209 59.8331 
309 9 54 81 17.5784 55.5878 359 12 88 81 18.9473 59.9166 
310 9 61 00 17.6068 55.6776 360 12 96 00 18.9737 60.0000 
311 9 67 21 17.6352 55.7674 361 13 03 21 19.0000 60.0833 
312 9 73 44 17.6635 55.8570 362 13 10 44 19.0263 60.1664 
313 9 79 69 17.6918 55.9464 363 13 17 69 19.0526 60.2495 
314 9 85 96 17.7200 56.0357 364 13 24 96 19.0788 60.3324 
315 9 92 25 17.7482 56.1249 365 13 32 25 19.1050 60.4152 
316 9 98 56 17.7764 56.2139 366 13 39 56 19.1311 60.4979 
317 1004 89 17.8045 56.3028 367 13 46 89 19.1572 60.5805 
318 1011 24 17.8326 56.3915 368 13 54 24 19.1833 60.6630 
319 101761 17.8606 56.4801 369 13 61 61 19.2094 60.7454 
320 1024 00 17.8885 56.5685 370 13 69 00 19.2354 60.8276 
321 103041 17.9165 56.6569 371 13 76 41 19.2614 60.9098 
322 10 36 84 17.9444 56.7450 372 13 83 84 19.2873 60.9918 
323 1043 29 17.9722 56.8331 373 13 91 29 19.3132 61.0737 
324 10 49 76 18.0000 56.9210 374 13 98 76 19.3391 61.1555 
325 1056 25 18.0278 57.0088 375 14 06 25 19.3649 61.2372 
326 10 62 76 18.0555 57.0964 376 14 13 76 19.3907 61.3188 
327 10 69 29 18.0831 57.1839 377 14 21 29 19.4165 61.4003 
328 107584 18.1108 57.2713 378 14 28 84 19.4422 61.4817 
329 108241 18.1384 57.3585 379 14 36 41 19.4679 61.5630 
330 10 89 00 18.1659 57.4456 380 14 44 00 19.4936 61.6441 
331 10 95 61 18.1934 57.5326 381 14 БІ 61 19.5192 61.7252 
332 11 02 24 18.2209 57.6194 382 14 59 24 19.5448 61.8061 
333 11 08 89 18.2483 57.7062 383 14 66 89 19.5704 61.8870 
334 111556 18.2757 57.7927 384 14 74 56 19.5959 61.9677 
335 112225 18.3030 57.8792 385 14 82 25 19.6214 62.0484 
336 112896 183303 579655 386 14 89 96 19.6469 62.1289 
387 11 35 69 18.3576 58.0517 887 14 97 69 19.6723 62.2093 
338 114244 18.3848 581378 388 15 05 44 19.6977 62.2896 
339 11 49 21 18.4120 58.2937 889 15 13 21 19.7231 62.3699 
340 115600 184391 58.3095 390 152100 19.7484 624500 
341 116281 18.4662 58.3952 391 15 28 81 19.7737 62.5300 
342 11 69 64 18.4932 58.4808 392 15 36 64 19.7990 62.6099 
343 11 76 49 18.5203 58.5662 393 15 44 49 19.8242 62.6897 
344 11 83 36 18.5472 58.6515 394 15 52 36 19.8494 62.7694 
345 11 90 25 18.5742 58.7367 395 15 60 25 19.8746 62.8490 
346 119716 18.6011 588218 396 15 6816 198997 62.9285 
347 12 04 09 18.6279 58.9967 397 157609 19.9249 63.0079 
348 121104 18.6548 58.9915 398 158404 199499 63.0872 
349 121801 18.6815 59.0762 399 15 92 9i 19.9750 63.1664 
350 12 25 00 18.7083 59.1608 400 16 00 00 20.0000 63.2456


Table M (Continued)

N N² √N √10N N N² √N √10N
401 16 08 01 20.0250 63.3246 451 2034 
402 16 16 04 20.0499 63.4035 452 2043 A 213608 BS. 
403 16 24 09 20.0749 63.4823 453 205209 212838 673053 
404 163216 20.0998 63.5610 454 206116 213073 673195 
405 16 4025 20.1246 63.6396 455 20 7025 21.3307 674587 
406 16 48 36 20.1494 63.7181 456 207936 21.3542 67.5278 
407 165649 20.1742 63.7966 457 208849 213776 67.6018 
408 16 64 64 20.1990 63.8749 458 209764 214009 67.6757 
409 16 72 81 20.2237 63.9531 459 210681 214943 67.7495 - 
410 168100 20.2485 640312 460 2116 00 214476 67.8233 
411 168921 20.2731 64.1093 461 2125 21 214709 67.8970 
412 16 9744 202978 64.1872 462 213444 214942 67.9706 
413 17 05 69 20.3224 64.2651 463 214369 21.5174 68.0441 
414 1713 96 20.3470 64.3428 464 215296 21.5407 681175 
415 172225 20.3715 64.4205 465 216225 21.5639 68.1909 
416 173056 20.3961 64.4981 466 21 71 56 215870 68.2642 
417 173889 20.4206 64.5755 467 218089 21.6102 68.3874 
418 174724 20.4450 64.6529 468 21 9024 216333 684105 | 
419 175561 20.4695 64.7302 469 219961 21.6564 68.4836 
420 -17 64 00 20.4939 64.8074 470 22 09 00 21.6795 68.5565 | 
421 17 72 41 20.5183 64.8845 471 221841 21.7025 68,6294 - 
492 17 8084 20.5426 64.9615 472 222784 21.1256 68.7023 
423 17 8929 20.5670 65.0385 413 22 37 29 21.7486 — 68.7750 | 
424 17 97 76 20.5913 65.1153 474 22 46 76 21.7715 68.8477 | 
425 180625 20.6155 65.1920 475 225625 21.1945 68.9202 - 
496 18 14 76 20.6398 65.2687 476 226576 21.8174 68.9928. | 
497 182329 20.6640 65.3452 477 22 1529 218403 69.0652 - 
428 18 31 84 20.6882 654217 478 228484 21.8632 69.1375 
499 184041 20.7123 65.4981 479 229441 21.8861 69.2098 - 
430 184900 20.7364 65.5744 480 230400 21.9089 69.2820 - 
431 185761 20.7605 65.6506 481 231361 21.9817 693542 | 
432 18 66 24 20.7846 65.7267 482 232324 219545 69.4262 - 
433 18 74 89 20.8087 658027 483 23 32 89 21.9778 69.4982 - 
434 188356 20.8327 65.8787 484 23 42 56 22.0000 69.5701 | 
435 189225 20.8567 65.9545 485 235225 22.0227 69.6419 
436 190006 20.8806 66.0303 486 23 6196 22.0454 69.7137 
437 190969 20.9045 661060 487 23 71 69 22.0681 69,854 | 
438 191844 20.0284 66.1816 488 23 8144 22.0907 69.8070 | 
439 192721 20.9523 66.2571 489 239121 22.1133 . 
440 193600 20.9762 66.3325 490 240100 22.1359 70.0000 - 
441 194481 21.0000 66.4078 491 241081 22.1585 700714 | 
442 19 53 64 21.0238 66.4831 492 242064 221811 70.1427 — 
443 196249 21.0476 66.5582 493 243049 22.2036 702140 |, 
444 19 7136 21.0713 66.6333 494 244036 22.2261 10.2851 и 
445 198025 21.0950 66.7083 495 245025 22.2486 70.3562 
446 198916 21.1187 66.1832 496 246016 22.2711 70.4278 
441 199809 21.1424 66.8581 497 241009 22.2935 70.4982 - 
448 200704 21.1660 66.9328 498 24 8004 22.3159 70.5691 
449 201601 21.1896 67.0075 499 249001 22.3383 70.6399 

450 20 25 00 21.2132 67.0820 500 25 00 00 22.3607 70.7107


Table M (Continued)


  N       N²         √N       √10N        N       N²         √N       √10N
701    49 14 01    26.4764   83.7257     751    56 40 01    27.4044   86.6603
702    49 28 04    26.4953   83.7854     752    56 55 04    27.4226   86.7179
703    49 42 09    26.5141   83.8451     753    56 70 09    27.4408   86.7756
704    49 56 16    26.5330   83.9047     754    56 85 16    27.4591   86.8332
705    49 70 25    26.5518   83.9643     755    57 00 25    27.4773   86.8907
706    49 84 36    26.5707   84.0238     756    57 15 36    27.4955   86.9483
707    49 98 49    26.5895   84.0833     757    57 30 49    27.5136   87.0057
708    50 12 64    26.6083   84.1427     758    57 45 64    27.5318   87.0632
709    50 26 81    26.6271   84.2021     759    57 60 81    27.5500   87.1206
710    50 41 00    26.6458   84.2615     760    57 76 00    27.5681   87.1780
711    50 55 21    26.6646   84.3208     761    57 91 21    27.5862   87.2353
712    50 69 44    26.6833   84.3801     762    58 06 44    27.6043   87.2926
713    50 83 69    26.7021   84.4393     763    58 21 69    27.6225   87.3499
714    50 97 96    26.7208   84.4985     764    58 36 96    27.6405   87.4071
715    51 12 25    26.7395   84.5577     765    58 52 25    27.6586   87.4643
716    51 26 56    26.7582   84.6168     766    58 67 56    27.6767   87.5214
717    51 40 89    26.7769   84.6759     767    58 82 89    27.6948   87.5785
718    51 55 24    26.7955   84.7349     768    58 98 24    27.7128   87.6356
719    51 69 61    26.8142   84.7939     769    59 13 61    27.7308   87.6926
720    51 84 00    26.8328   84.8528     770    59 29 00    27.7489   87.7496
721    51 98 41    26.8514   84.9117     771    59 44 41    27.7669   87.8066
722    52 12 84    26.8701   84.9706     772    59 59 84    27.7849   87.8635
723    52 27 29    26.8887   85.0294     773    59 75 29    27.8029   87.9204
724    52 41 76    26.9072   85.0882     774    59 90 76    27.8209   87.9773
725    52 56 25    26.9258   85.1469     775    60 06 25    27.8388   88.0341
726    52 70 76    26.9444   85.2056     776    60 21 76    27.8568   88.0909
727    52 85 29    26.9629   85.2643     777    60 37 29    27.8747   88.1476
728    52 99 84    26.9815   85.3229     778    60 52 84    27.8927   88.2043
729    53 14 41    27.0000   85.3815     779    60 68 41    27.9106   88.2610
730    53 29 00    27.0185   85.4400     780    60 84 00    27.9285   88.3176
731    53 43 61    27.0370   85.4985     781    60 99 61    27.9464   88.3742
732    53 58 24    27.0555   85.5570     782    61 15 24    27.9643   88.4308
733    53 72 89    27.0740   85.6154     783    61 30 89    27.9821   88.4873
734    53 87 56    27.0924   85.6738     784    61 46 56    28.0000   88.5438
735    54 02 25    27.1109   85.7321     785    61 62 25    28.0179   88.6002
736    54 16 96    27.1293   85.7904     786    61 77 96    28.0357   88.6566
737    54 31 69    27.1477   85.8487     787    61 93 69    28.0535   88.7130
738    54 46 44    27.1662   85.9069     788    62 09 44    28.0713   88.7694
739    54 61 21    27.1846   85.9651     789    62 25 21    28.0891   88.8257
740    54 76 00    27.2029   86.0233     790    62 41 00    28.1069   88.8819
741    54 90 81    27.2213   86.0814     791    62 56 81    28.1247   88.9382
742    55 05 64    27.2397   86.1394     792    62 72 64    28.1425   88.9944
743    55 20 49    27.2580   86.1974     793    62 88 49    28.1603   89.0505
744    55 35 36    27.2764   86.2554     794    63 04 36    28.1780   89.1067
745    55 50 25    27.2947   86.3134     795    63 20 25    28.1957   89.1628
746    55 65 16    27.3130   86.3713     796    63 36 16    28.2135   89.2188
747    55 80 09    27.3313   86.4292     797    63 52 09    28.2312   89.2749
748    55 95 04    27.3496   86.4870     798    63 68 04    28.2489   89.3308
749    56 10 01    27.3679   86.5448     799    63 84 01    28.2666   89.3868
750    56 25 00    27.3861   86.6025     800    64 00 00    28.2843   89.4427


Table M (Continued)


92.5203 - 
92.5743 

92.6283 - 
92.6823 - 
92.7362 


92.7901 
92.8440 | 
92.8978. 
92.0516 | 
93.0054 


93.8616 | 
93.9149 
93.9681. 


94.7101 


N N² √N √10N N N² √N
801 641601 28.3019 89.4986 851, 724201 291719 
802 64 32 04 28.3106 89.5545 852 725904 29.1890 
803 64 48 09 28.3373 89.6103 853 72 76 09 29.2062 
804 64 64 16 28.3549 89.6660 854 729316 29.2233 
805 64 8025 28.3725 89.7218 855 78 1025 29.2404 
806 64 96 36 28.3901 89.7775 856 73 27 36 29.2575 
807 65 12 49 28.4077 89.8332 857 73 44 49 29.9746 
808 65 28 64 28.4253 89.8888 858 73 61 64 29.2916 
809 65 44 81 28.4429 89.9444 859 73 78 81 29.3087 
810 65 61 00 28.4605 90.0000 860 73 96 00 29.3258 
811 65 77 21 28.4781 90.0555 861 74 13 21 29.3428 
812 65 93 44 28.4956 90.1110 862 74 30 44 29.3598 
813 66 09 69 28.5132 90.1665 863 74 47 69 29.3769 
814 66 25 96 28.5307 90.2219 864 746496 29.3939 
815 66 42 25 28.5482 90.2774 865 74 82 25 29.4109 
816 66 58 566 28.5657 90.3327 866 74 99 56 29.4279 
817 66 74 89 28.5832 90.3881 867 75 16 89 29.4449 
818 66 91 24 28.6007 90.4434 868 75 34 24 29.4618 
819 67 07 61 28.6182 90.4986 869 75 51 61 29.4788 
820 67 24 00 28.6356 90.5539 870 75 69 00 29.4958 
821 67 40 41 28.6531 90.6091 871 75 86 41 29.5127 
822 67 56 84 28.6705 90.6642 872 76 03 84 29.5296 
823 67 73 29 28.6880 90.7193 873 76 21 29 29.5466 
824 67 89 76 28.7054 90.7744 874 76 38 76 29.5635 
825 - 68 06 25 28.7228 90.8295 875 76 56 25 · 29.5804 
826 68 22 76 28.7402 90.8845 876 76 73 76 29.5973 
827 68 39 29 28.7576 90.9395 877 176 91 29 29.6142 
828 68 55 84 28.7750 90.9945 878 7708 84 29.6311 
829 68 72 41 28.7924 91.0494 879 772641 29.6479 
830 68 89 00 28.8097 91.1043 880 774400 29.6648 
831 69 05 61 28.8271 91.1592 881 77 61 61 29.6816 
832 692224 28.8444 91.2140 882 777924 29.6985 
833 69 38 89 28.8617 91.2688 883 77 96 89 29.7153 
834 69 55 56 28.8791 91.3236 884 78 14 566 29.7321 
835 69 72 25 28.8964 91.3783 885 78 32 25 29.7489 
836 69 88 96 28.9137 91.4330 886 78 49 96 29.7658 
837 700569 28.9310 91.4877 887 78 67 69 29.7825 
838 70 22 44 28.9482 91.5423 888 78 85 44 29.7993 
839 70 39 21 28.9655 91.5969 889 79 03 21 29.8161 
840 70 56 00 28.9828 91.6515 890 79 2100 29.8329 
841 70 72 81 29.0000 91.7061 891 79 38 81 29.8496 
842 70 89 64 29.0172 91.7606 892 79 56 64 29.8664 
843 71 06 49 29.0345 91.8150 893 79 74 49 29.8831 
844 71 23 36 29.0517 91.8695 894 179 92 36 29.8998 
845 171 40 25 29.0689 91.9239 895 801025 29.9166 
846 715716 29.0861 91.9783 396 802816 29.9333 
ват 717409 29.1033 92.0326 897 80 46 09 29.9500 
848 ті 91 04 291204 92.0869 898 806404 29.9666 
849 720801 29.1376 92.1412 899 80 82 01 29.9833 
850 72 25 00 29.1548 92.1954 900 81 00 00 30.0000 


Table M (Continued)

N N² √N √10N N N² √N √10N
901 81 18 01 30.0167 94.9210 951 90 44 01 30.8383 97.5192
902 81 36 04 30.0333 94.9737 952 90 63 04 30.8545 97.5705
903 81 54 09 30.0500 95.0263 953 90 82 09 30.8707 97.6217
904 81 72 16 30.0666 95.0789 954 91 01 16 30.8869 97.6729
905 81 90 25 30.0832 95.1315 955 91 20 25 30.9031 97.7241
906 82 08 36 30.0998 95.1840 956 91 39 36 30.9192 97.7753
907 82 26 49 30.1164 95.2365 957 91 58 49 30.9354 97.8264
908 82 44 64 30.1330 95.2890 958 91 77 64 30.9516 97.8775
909 82 62 81 30.1496 95.3415 959 91 96 81 30.9677 97.9285
910 82 81 00 30.1662 95.3939 960 92 16 00 30.9839 97.9796
911 82 99 21 30.1828 95.4463 961 92 35 21 31.0000 98.0306
912 83 17 44 30.1993 95.4987 962 92 54 44 31.0161 98.0816
913 83 35 69 30.2159 95.5510 963 92 73 69 31.0322 98.1326
914 83 53 96 30.2324 95.6033 964 92 92 96 31.0483 98.1835
915 83 72 25 30.2490 95.6556 965 93 12 25 31.0644 98.2344
916 83 90 56 30.2655 95.7079 966 93 31 56 31.0805 98.2853
917 84 08 89 30.2820 95.7601 967 93 50 89 31.0966 98.3362
918 84 27 24 30.2985 95.8123 968 93 70 24 31.1127 98.3870
919 84 45 61 30.3150 95.8645 969 93 89 61 31.1288 98.4378
920 84 64 00 30.3315 95.9166 970 94 09 00 31.1448 98.4886
921 84 82 41 30.3480 95.9687 971 94 28 41 31.1609 98.5393
922 85 00 84 30.3645 96.0208 972 94 47 84 31.1769 98.5901
923 85 19 29 30.3809 96.0729 973 94 67 29 31.1929 98.6408
924 85 37 76 30.3974 96.1249 974 94 86 76 31.2090 98.6914
925 85 56 25 30.4138 96.1769 975 95 06 25 31.2250 98.7421
926 85 74 76 30.4302 96.2289 976 95 25 76 31.2410 98.7927
927 85 93 29 30.4467 96.2808 977 95 45 29 31.2570 98.8433
928 86 11 84 30.4631 96.3328 978 95 64 84 31.2730 98.8939
929 86 30 41 30.4795 96.3846 979 95 84 41 31.2890 98.9444
930 86 49 00 30.4959 96.4365 980 96 04 00 31.3050 98.9949
931 86 67 61 30.5123 96.4883 981 96 23 61 31.3209 99.0454
932 86 86 24 30.5287 96.5401 982 96 43 24 31.3369 99.0959
933 87 04 89 30.5450 96.5919 983 96 62 89 31.3528 99.1464
934 87 23 56 30.5614 96.6437 984 96 82 56 31.3688 99.1968
935 87 42 25 30.5778 96.6954 985 97 02 25 31.3847 99.2472
936 87 60 96 30.5941 96.7471 986 97 21 96 31.4006 99.2975
937 87 79 69 30.6105 96.7988 987 97 41 69 31.4166 99.3479
938 87 98 44 30.6268 96.8504 988 97 61 44 31.4325 99.3982
939 88 17 21 30.6431 96.9020 989 97 81 21 31.4484 99.4485
940 88 36 00 30.6594 96.9536 990 98 01 00 31.4643 99.4987
941 88 54 81 30.6757 97.0052 991 98 20 81 31.4802 99.5490
942 88 73 64 30.6920 97.0567 992 98 40 64 31.4960 99.5992
943 88 92 49 30.7083 97.1082 993 98 60 49 31.5119 99.6494
944 89 11 36 30.7246 97.1597 994 98 80 36 31.5278 99.6995
945 89 30 25 30.7409 97.2111 995 99 00 25 31.5436 99.7497
946 89 49 16 30.7571 97.2625 996 99 20 16 31.5595 99.7998
947 89 68 09 30.7734 97.3139 997 99 40 09 31.5753 99.8499
948 89 87 04 30.7896 97.3653 998 99 60 04 31.5911 99.8999
949 90 06 01 30.8058 97.4166 999 99 80 01 31.6070 99.9500
950 90 25 00 30.8221 97.4679 1,000 1 00 00 00 31.6228 100.0000




REFERENCES 


Aitken, A. C., 1937: The Evaluation of a Certain Triple-product Matrix. Proceedings of the Royal Society of Edinburgh, 57:172-181.
Aspen, Alice A., 1949: Tables for Use in Comparisons Whose Accuracy Involves Two Variances, Separately Estimated. Biometrika, 36:290-291.
Bancroft, T. A., 1968: Topics in Intermediate Statistical Methods. Ames, Iowa: The Iowa State University Press.
Binder, A., 1955: The Choice of an Error Term in Analysis of Variance Designs. Psychometrika, 20:29-50.
Bradley, James V., 1968: Distribution-free Statistical Tests. Englewood Cliffs, N.J.: Prentice-Hall, Inc.
Carroll, John B., 1953: An Analytical Solution for Approximating Simple Structure in Factor Analysis. Psychometrika, 18:23-38.
Cochran, William G., and Gertrude M. Cox, 1960: Experimental Designs. New York: John Wiley & Sons, Inc.
Comrie, L. J. (ed.), 1965: Barlow's Tables. London: Science Paperbacks, E. and F. N. Spon.
Cornell, Francis G., 1956: The Essentials of Educational Statistics. New York: John Wiley & Sons, Inc.
Cronbach, L. J., 1957: The Two Disciplines of Scientific Psychology. The American Psychologist, 12:671-684.
Dubois, Philip H., 1965: An Introduction to Psychological Statistics. New York: Harper and Row, Publishers.
Dubois, Philip H., 1970: Varieties of Psychological Test Homogeneity. The American Psychologist, 25:532-536.
Duncan, D. B., 1955: Multiple Range and Multiple F-tests. Biometrics, 11:1-42.
Duncan, D. B., 1957: Multiple Range Tests for Correlated and Heteroscedastic Means. Biometrics, 13:164-176.
Edwards, Allen L., 1967: Statistical Methods, 2d ed. New York: Holt, Rinehart and Winston, Inc.
Edwards, Allen L., 1968: Experimental Design in Psychological Research, 3d ed. New York: Holt, Rinehart and Winston, Inc.
Ferguson, George A., 1941: The Reliability of Mental Tests. London: University of London Press, Ltd.
Ferguson, George A., 1951: A Note on the Kuder-Richardson Formula. Educational and Psychological Measurement, 11:612-615.
Ferguson, George A., 1954: The Concept of Parsimony in Factor Analysis. Psychometrika, 19:281-290.
Ferguson, George A., 1965: Nonparametric Trend Analysis. Montreal: McGill University Press.




Finney, D. J., 1944: The Application of Probit Analysis to the Results of Mental Tests. Psychometrika, 9:31-39.
Finney, D. J., 1947: Probit Analysis. New York: Cambridge University Press.
Finney, D. J., 1948: The Fisher-Yates Test of Significance in 2 × 2 Contingency Tables. Biometrika, 35:145-156.
Finney, D. J., 1960: The Theory of Experimental Design. Chicago: University of Chicago Press.
Fisher, R. A., 1948: Statistical Methods for Research Workers, 10th ed. Edinburgh: Oliver & Boyd, Ltd.
Fisher, R. A., and F. Yates, 1963: Statistical Tables for Biological, Agricultural, and Medical Research, 4th ed. Edinburgh: Oliver & Boyd, Ltd.
Freund, John E., 1962: Mathematical Statistics. Englewood Cliffs, N.J.: Prentice-Hall, Inc.
Freund, John E., 1967: Modern Elementary Statistics, 3d ed. Englewood Cliffs, N.J.: Prentice-Hall, Inc.
Friedman, M., 1937: The Use of Ranks to Avoid the Assumption of Normality Implicit in the Analysis of Variance. Journal of the American Statistical Association, 32:675-701.
Friedman, M., 1940: A Comparison of Alternative Tests of Significance for the Problem of m Rankings. Annals of Mathematical Statistics, 11:86-92.
Gourlay, Neil, 1955: F-test Bias for Experimental Designs in Educational Research. Psychometrika, 20:227-248.
Gronow, D. G. C., 1951: Test for the Significance of Differences between Means in Two Normal Populations Having Unequal Variances. Biometrika, 38:252-256.
Guilford, J. P., 1965: Fundamental Statistics in Psychology and Education, 4th ed. New York: McGraw-Hill Book Company.
Gulliksen, H., 1950: Theory of Mental Tests. New York: John Wiley & Sons, Inc.
Harman, Harry H., 1967: Modern Factor Analysis. Chicago: University of Chicago Press.
Hersberg, Paul A., 1969: The Parameters of Cross-validation. Psychometrika Monograph Supplement, no. 16.
Holzinger, Karl J., and Frances Swineford, 1937: The Bi-factor Method. Psychometrika, 2:41-54.
Jackson, R. W. B., and George A. Ferguson, 1941: Studies on the Reliability of Tests. Bulletin 12, University of Toronto, Department of Educational Research, Toronto.
Jackson, R. W. B., and George A. Ferguson, 1942: Manual of Educational Statistics. University of Toronto, Department of Educational Research, Toronto.
Johnson, Palmer O., 1949: Statistical Methods in Research. Englewood Cliffs, N.J.: Prentice-Hall, Inc.
Kaiser, Henry F., 1958: The Varimax Criterion for Analytic Rotation in Factor Analysis. Psychometrika, 23:187-200.


Kaiser, Henry F., 1960: Directional Statistical Decisions. Psychological Review, 67:160-167.
Keeping, E. S., 1962: Introduction to Statistical Inference. Princeton, N.J.: D. Van Nostrand Company, Inc.
Kendall, M. G., 1951: The Advanced Theory of Statistics, vol. 2, 3d ed. London: Charles Griffin and Company, Ltd.
Kendall, M. G., 1952: The Advanced Theory of Statistics, vol. 1, 5th ed. London: Charles Griffin and Company, Ltd.
Kendall, M. G., 1955: Rank Correlation Methods, 2d ed. London: Charles Griffin and Company, Ltd.
Kendall, M. G., 1968: The Advanced Theory of Statistics, vol. 3, 2d ed. London: Charles Griffin and Company, Ltd.
Kenney, John F., and E. S. Keeping, 1951: Mathematics of Statistics, part 2, 2d ed. Princeton, N.J.: D. Van Nostrand Company, Inc.
Kenney, John F., and E. S. Keeping, 1954: Mathematics of Statistics, part 1, 3d ed. Princeton, N.J.: D. Van Nostrand Company, Inc.
Keuls, M., 1952: The Use of the Studentized Range in Connection with the Analysis of Variance. Euphytica, 1:112-122.
Kruskal, W. H., and W. A. Wallis, 1952: Use of Ranks in One-criterion Variance Analysis. Journal of the American Statistical Association, 47:583-621.
Kuder, G. F., and M. W. Richardson, 1937: The Theory and Estimation of Test Reliability. Psychometrika, 2:151-160.
Lacey, John I., 1956: The Evaluation of Autonomic Responses: Towards a General Solution. Annals of the New York Academy of Sciences, 67:123-164.
Lawley, D. N., 1940: The Estimation of Factor Loadings by the Method of Maximum Likelihood. Proceedings of the Royal Society of Edinburgh, 60:44-82.
Lawley, D. N., and A. E. Maxwell, 1963: Factor Analysis as a Statistical Method. London: Butterworth and Company, Ltd.
Lord, Frederic M., 1955a: Estimating Test Reliability. Educational and Psychological Measurement, 15:325-336.
Lord, Frederic M., 1955b: Sampling Fluctuations Resulting from the Sampling of Test Items. Psychometrika, 20:1-22.
Lord, Frederic M., 1957: Do Tests of the Same Length Have the Same Standard Errors of Measurement? Educational and Psychological Measurement, 17:510-521.
Lord, Frederic M., and Melvin R. Novick, 1968: Statistical Theories of Mental Test Scores. Reading, Mass.: Addison-Wesley Publishing Company.
MacMeeken, A. M., 1940: The Intelligence of a Representative Group of Scottish Children. London: University of London Press, Ltd.
Magnusson, David, 1967: Test Theory. Reading, Mass.: Addison-Wesley Publishing Company.
Mann, H. B., and D. R. Whitney, 1947: On a Test of Whether One of Two Random Variables is Stochastically Larger Than the Other. Annals of Mathematical Statistics, 18:50-60.
McNemar, Quinn, 1947: Note on the Sampling Error of the Differences between Correlated Proportions or Percentages. Psychometrika, 12:153-157.
McNemar, Quinn, 1969: Psychological Statistics, 4th ed. New York: John Wiley & Sons, Inc.
Meredith, William M., 1967: Basic Mathematical and Statistical Tables for Psychology and Education. New York: McGraw-Hill Book Company.
Nair, K. R., 1940: Tables of Confidence Intervals for the Median in Samples from Any Continuous Population. Sankhya, 4:551-558. (Not seen.)
Neuhaus, Jack O., and Charles Wrigley, 1954: The Quartimax Method: an Analytical Approach to Orthogonal Simple Structure. British Journal of Statistical Psychology, 7:81-91.
Newman, D., 1939: The Distribution of the Range in Samples from a Normal Population in Terms of an Independent Estimate of Standard Deviation. Biometrika, 31:20-30.
Saunders, D. R., 1953: An Analytic Method for Rotation to Orthogonal Simple Structure. Research Bulletin, RB 53-10. Princeton, N.J.: Educational Testing Service.
Scheffé, H., 1953: A Method for Judging All Contrasts in the Analysis of Variance. Biometrika, 40:87-104.
Scheffé, H., 1959: The Analysis of Variance. New York: John Wiley & Sons, Inc.
Siegel, Sidney, 1956: Nonparametric Statistics. New York: McGraw-Hill Book Company.
Snedecor, George W., and William G. Cochran, 1967: Statistical Methods, 6th ed. Ames, Iowa: The Iowa State University Press.
Stevens, S. S., 1957: On the Psychophysical Law. Psychological Review, 64:153-181.
Thomson, Godfrey H., 1951: The Factorial Analysis of Human Ability, 5th ed. London: University of London Press, Ltd.
Thurstone, L. L., 1944: A Factorial Study of Perception. Chicago: University of Chicago Press.
Torgerson, Warren S., 1958: Theory and Methods of Scaling. New York: John Wiley & Sons, Inc.
ution of the Analysis of Variance and Covariance in
Disproportionate Numbers of Observations in the Subclasses. Psychometrika, 11:107-128.
Tukey, John W., 1949: Comparing Individual Means in the Analysis of Variance. Biometrics, 5:99-114.
Walker, Helen M., and Joseph Lev, 1953: Statistical Inference. New York: Holt, Rinehart and Winston, Inc.
Welch, B. L., 1938: The Significance of the Differences between Two Means When the Population Variances are Unequal. Biometrika, 29:350-362.
Welch, B. L., 1947: The Generalization of Student's Problem When Several Different Population Variances Are Involved. Biometrika, 34:28-35.
Wilk, M. B., and O. Kempthorne, 1955: Fixed, Mixed, and Random Models. Journal of the American Statistical Association, 50:1144-1167.
Winer, B. J., 1962: Statistical Principles in Experimental Design. New York: McGraw-Hill Book Company.
Woo, T. L., 1928: Dextrality and Sinistrality of Hand and Eye, 2d Memoir. Biometrika, 20A:79-148.


Absolute zero, 14 
Age allowances, 388—389 
Aitken, A. C., 398, 479
Aitken's numerical solution, 398— 
401 
Alternative hypothesis, 148, 150— 
151 
Analysis of covariance, 288-300 
adjusting sum of squares, 292— 
293 
computation, 294-295 
degrees of freedom, 293 
extended use of, 299 
notation, 289-290 
partitioning a sum of products,
290
regression lines in, 291—292 
in testing homogeneity of 
regression coefficients, 298— 
299 
variance estimates, 293—294 
Analysis of variance, 208-300 
assumptions underlying, 219- 
220 
choice of error term, 232,
256-257
classification: one-way, 208-221
three-way, 246-267
two-way, 223-243
comparison of means following
an F test, 268-275


Analysis of variance: 
computation: one-way classifi- 
cation, 214-218 
three-way classification, 257- 
263 
two-way classification, 234— 
238 
unequal numbers in sub- 
classes, 238-241, 263— 
264 
covariance method (see 
Analysis of covariance) 
degrees of freedom, 212-213, 
226-228, 251 
F ratio in, 214, 218, 232-233,
256-257, 270-271, 280,
285-286
interaction, nature of, 228-229, 
251-254 
mean square (see variance 
estimate below) 
models, finite, random, fixed, and 
mixed, 229-232, 254-257 
multiple comparisons, 269-275 
notation, 210-211, 224-225,
247-249
null hypothesis in, 214, 230 
by ranks: correlated samples, 
333-335 
independent samples, 331- 
333 


Analysis of variance: 
repeated measurements, 241— 
243, 264-265 
sum of squares: between 
groups, 211-212 
within groups, 211-212 
for interaction, 225—226, 
228-229, 251-254 
partitioning, 211-212, 225-
226, 249-250
pooling, 233-234
for two groups, 218-219
with unequal numbers in sub-
classes, 238-241, 263-264,
285
variance estimate: expectation,
213-214, 229-232, 254-256
meaning, 213-214, 229-232,
254-256
one-way classification, 212- 
214 
three-way classification, 254— 
257 
two-way classification, 226- 
2r 
Analytical methods of rotation in 
factor analysis, 421-425 
Arc sine transformation, 220 
Arithmetic mean (see Mean, 
arithmetic) 
Aspen, Alice A., 157, 479 


Asymptotic relative efficiency, 322
rank test for k correlated
samples, 335
rank test for k independent
samples, 333
for two correlated samples,
331
for two independent samples,
329
sign test for two correlated
samples, 325
sign test for two independent
samples, 324
Attenuation, 370-371
Average, 44-45
(See also Mean; Median)


Biased estimate, 59
Bimodal distribution, 39
Binder, A., 234, 479
Binet, Alfred, 5
Binomial distribution, 37, 70, 78-
84
goodness of fit, 177-178
and hypothesis testing, 83-84
kurtosis of, 81-82
limiting form, 86-87
mean of, 81-82
related to normal curve, 86-87
skewness, 81-82
variance, 81-82
Biometrika, 8
Biserial correlation, 358-359
Bivariate distribution, 103-104
Bliss, C. I., 221
Bradley, James V., 323, 335, 342
Burt, Cyril, 405


Carroll, John B., 421, 479
Centroid solution, 416
Chi square (χ²), 173-192
applied in analysis of variance
by ranks, 332
applied in contingency tables.
2.


es, 335
for k independent samples,


332-333


Chi square (χ²):
applied in sign test: for k in- 
dependent samples, 326 
for two correlated samples, 
324 
for two independent samples, 
323-324 
computation, combining fre- 
quencies in, 178 
correction for continuity, 
188-189 
critical values, table, 451 
defined, 174 
degrees of freedom, 175-176, 
180, 184, 189—191, 324, 326, 
332, 335 
distribution, 175-177 
formulas for, 174, 185, 188, 190 
for fourfold table, 185—186, 
188—189 
one- and two-tailed tests, 
189-190 
related to normal deviate, 186, 
325 
phi coefficient, 349 
sample size, 190 
sampling distribution, 175-177 
small expected frequencies, 
188-189 
in test: of coefficient of con-
cordance, 315
of coefficient of consistence,
318
of difference between pro-
portions, 186-188
of goodness of fit, 177-182
of independence, 182-186
of unequal and dispropor-
tionate frequencies,
238-239
Class boundaries, 30
Class interval, 27-31
conventions regarding, 27-28
defined, 27
distribution of observations
within, 29-31
exact limits, 29
mid-point, 31
Classification variables, 198,
205-206
Clemm, Donald S., 468n. 
Cochran, W. G., 155, 157, 201, 
206, 221, 479, 482 
Coefficient: 
of concordance, 312-314 


formula for, 313 


Coefficient: 
of concordance: related to rho, 
314 
significance, 315 
with tied ranks, 314 
of consistence, 315-318 
formula for, 317 
significance, 318 
of orthogonal polynomials,
table of, 463
(See also Correlation
coefficient; Phi co-
efficient; Reliability
coefficient)
Combinations, 77-78 
Communality, 410, 414—415 
Complete factorial experiment, 
202 
Comrie, L. J., 479 
Concordance (see Coefficient) 
Confidence interval, 136, 138-139, 
143-145 
for correlation coefficient, 169 
for means: of large samples, 
138-139 
of small samples, 143 
for median, 144 
for proportion, 143-144 
for standard deviation, 144-145 
Consistence (see Coefficient, of 
consistence) 
Consistent estimate, 136-138 
Constant, defined, 11 
Constant process, 5 
Contingency table, 182-183 
Cornell, Francis G., 176n., 479 
Correction: 
for attenuation, 370-371
for continuity, 188-189, 310, 
325, 327-328, 337-338 
Correlation, 96-106, 348-351, 
356-359, 390-403 
measures of: biserial, 358-359 
concordance, 312-314 
Kendall's tau, 308-312 
multiple (see Multiple 
correlation) 
partial, 390-392 
phi coefficient, 348-351 
point biserial, 356-358 
product-moment (see 
Product-moment 
correlation) 
rank (see Rank correlation)
Spearman's rho, 305-308 


INDEX 


Correlation: 
and prediction, 96, 107—118, 
392-403 
and regression, 96-97, 392-401 
of sums, 392-394 
t ratio for, 169-172, 358 
between true scores, 370-371 
variance interpretation, 
115-117 
Correlation coefficient: 
confidence interval for, 169 
critical values, table, 457 
effect of measurement error on, 
370-371 
for multiple correlation, 395, 
397 
sampling distribution (see 
Sampling distribution) 
significance of, 169-170 
significance of difference: 
between correlated 
samples, 171-172 
between independent 
samples, 170-171 
standard error, 168—169 
transformation to z,, 168-171, 
456 
Covariance, 105 
Covariance analysis (see Analysis 
of covariance) 
Cox, D. R., 206, 479 
Cox, G. M., 155, 157, 201, 206,
479
Cronbach, L. J., 17, 479 
Cross-validation, 402 
Cumulative distribution, 31 


Darwin, Charles, 8-9 
Decile point, 378 
Degrees of freedom: 
in analysis of covariance (see 
Analysis of covariance) 
in analysis of variance (see 
Analysis of variance) 
for chi square, 176, 180, 184— 
185, 189-191, 332, 335 
for contingency tables, 184-185
for F, 165-166, 214, 218-219,
232-233, 256-257, 270-271,
294 
geometric interpretation, 142- 
143 
meaning, 60, 142-143 
in multiple comparisons, 270, 
273 


Degrees of freedom:
for t, 140-141, 153-154, 156-
157, 167, 169-171, 308, 392
Delta scores, 387 
Descriptive statistics, 10 
Design of experiments, 195-300 
complete factorial experiments, 
202 
factorial experiments, 202—204 
Latin square, 205 
randomization, 201 
randomized block, 204—205 
single-factor experiments, 
200 
terminology, 198-199 
Difference scores, 371 
Directional test, 150-151 
Disarray, measures of, 304—305 
Distribution: 
bimodal, 38 
binomial (see Binomial 
distribution) 
bivariate, 103, 111-112 
chi square, 175-177 
cumulative, 31 
F, 165-166 
frequency, 25-42 
graphic representation, 32- 
37 
J-shaped, 38-39 
normal, 87-94 
properties of, 37-42 
rank, 26 
rectangular, 38-39 
sampling (see Sampling 
distribution) 
skewed, 37-42 
t, 140-142 
U-shaped, 38-39 
Distribution-free tests (see Non- 
parametric tests) 
Dubois, Philip H., 354, 398, 
479 
Duncan, David B., 268, 273-274, 
479 
Duncan method of multiple com- 
parisons, 273-274 


Edwards, Allen L., 274, 456n., 
479 

Efficiency of estimate, 136-137, 
322 


Error: 
of estimate, 114-115 


Error: 
of measurement, 362-374 
effect on correlation co- 
efficient, 370-371 
effect on mean, 364
of mean, 369-370
effect on variance, 364
random, 362
standard deviation, 371-373
systematic, 362
sampling, 123-124
Type I, 148-149
Type II, 148-149 
Estimate, 120, 136-138 
consistent, 136-137 
efficient, 136-138 
error of, 114-115 
interval, 136 
meaning, 11, 136-138 
point, 136 
relative efficiency of, 137-1 
322 
sufficient, 136, 138 
unbiased, 136-137
Estimation, 136-145
Exact test of significance for
fourfold table, 340-342
Expected value, 137, 213-214, 
229-232, 254-257 
Experimental design (see Design
of experiments)


F ratio, 165-166
294, 298
bias in, 240-241
critical values, table, 452-455
related to t, 218-219
in test: of difference between
variances, 164-166
of homogeneity of regression,
298-299
of multiple comparisons,
270-271
of multiple correlation
coefficient, 401
Factor analysis, 404—425 
analytical methods of rotation 
421-425 
basic equations, 407—409 


centroid solution, 416 


Factor analysis:
communality, 410, 414-415
components of variance,
409-410
derived solutions, 414, 421-425
direct solutions, 414-419
geometry of, 411-414
maximum-likelihood solution,
416
oblique solution, 409
orthogonal solution, 409
principal-factor solution,
415-419
residual correlations, 411
rotational methods, 421-425
structure, 420-421
varimax method of rotation,
423-425
Factorial experiments, 202-204
Fechner, Gustav, 4
Ferguson, George A., 26n., 104n.,
108n., 129n., 340, 368, 373,
421, 479-480
Finney, D. J., 5, 206, 342, 480
Fisher, R. A., 9, 17, 56, 121, 168,
201, 205, 208, 281, 340, 450n.,
451n., 457n., 480
Fisher's zᵣ transformation, 168-
171
Fitting of line, 108-111
Fourfold point correlation (see
Phi coefficient)
Frequency, 26
comparison (see Chi square)
distribution (see Distribution)
observed, 174
polygon, 34-35
cumulative, 35-36
theoretical, 174
Freund, John E., 145, 480
Friedman, M., 315, 333, 480
two-way analysis of variance 
by ranks, 333-335, 337 
Function, meaning of, 86-87 


Galton, Francis, 8, 96-97 
Geometric mean, 44 
Goodness of fit, 177-182 
Gosset, W. S., 141 

Gourlay, Neil, 240-241, 480 
Graphs, 32-37 

Gronow, D. С. C., 157, 480 
Guilford, J. P., 469n., 480 
Gulliksen, H., 347, 366, 480 
Guthrie, Eugene H., 468n. 


H test, one-way analysis of vari- 
ance by ranks, 331-333 
Harman, Harry H., 405, 414, 416, 
417n., 418n., 419n., 480
Harmonic mean, 44 
Harter, H. Leon, 468n. 
Hersberg, Paul A., 402, 480
Histogram, 33—34 
Holzinger, Karl J., 405, 416, 480 
Homogeneity: 
of regression coefficients, 
298-299 
of variance, 153, 164-165, 219- 
221 
Homoscedasticity (see Homo- 
geneity, of variance) 
Hotelling, Harold, 415 
Hypothesis: 
alternative, 148, 150-151 
null, 147-148 
Hypothesis testing (see 
Significance) 


Independence tests, 182—186 
Inference, statistical, 4, 9-10, 120 
Integers, first N: in non-
parametric tests, 326-340
in rank correlation, 304-312 
standard deviation, 63—64 
sum, 21, 304 
of squares, 63, 304 
Interaction, 228-229, 251-254 
Interval: 
estimate, 136 
grouping (see Class interval)
variable, 13-14 
Item selection, 354-361 


J-shaped distribution, 39 

Jackson, R. W. B., 26n., 104n.,
108n., 129n., 373, 480 

Johnson, Palmer O., 144-145, 
177, 178n., 480 


Kaiser, Henry F., 150-151, 421, 
423, 480 
Katti, S. K., 460n. 
Keeping, E. S., 32n., 144, 481 
Kempthorne, O., 230, 483 
Kendall, M. G., 281, 308, 310-
312, 315, 317-318, 459n., 481 
Kendall's coefficient: 
of concordance, 312-315 
of consistence, 315-318 




Kendall's tau, 308-312 
significance, table for testing,
459

Kenney, John F., 32n., 144, 481 

Keuls, M., 268, 272-275, 481 

Kruskal, W. H., 331, 481 

Kruskal-Wallis one-way analysis 
of variance by ranks, 331- 
333, 336 

Kuder, G. F., 367, 481 

Kuder-Richardson formulas, 367- 
368, 373 

Kurtosis, 37, 42, 61-68 


Lacey, John I., 481 

Large sample statistics, 139, 141— 
142 

Latin square, 205 

Lawley, D. N., 416, 481 

Least-square method, 49, 108-
111

Leptokurtic distribution, 37 

Lev, Joseph, 482 

Lewis, D., 140n. 

Linear regression, 107-118, 277— 
280 

Loevinger, Jane, 354 

Logarithmic transformation, 220 


MacMeeken, A. M., 112n., 481
McNemar, Quinn, 163, 178-179,
181n., 200, 482
Magnusson, David, 347, 373, 481
Mann, H. B., 326, 329, 481
Mann-Whitney U test, 326-327 
Maximum likelihood solution in 
factor analysis, 416 
Maxwell, A. E., 416, 481 
Mean: 
arithmetic, 44-55 
of combined groups, 47 
confidence intervals of, 
138-139, 143 
defined, 45-46 
formulas for, 45—46 
properties, 47—49 
related to median and mode, 
52-53 
sampling distribution, 
126-131 
of test item, 347-348 
geometric, 44 
harmonic, 44 
Mean deviation, 57-58 




Mean square (see Analysis of 
variance, variance estimate) 
Median, 49-51 
confidence interval for, 144 
standard error of, 144 
Mendel, Abbé, 177 
Meredith, W. M., 17, 482 
Mesokurtic distribution, 37 
Mode, 51-52 
Moments, 67 
Monotonic functions, 336 
Monotonic trend analysis, 
335-340 
Müller, G. E., 5
Multiple comparisons, 268-275 
Duncan method, 273-274 
Newman-Keuls method, 
272-275 
Scheffé method, 270-271 
Tukey method, 274 
using F test, 270-271 
using studentized range, 
271-273 
using t test, 269-270
Multiple correlation, 392—403 
Aitken's numerical solution, 
398-401 
coefficient, 395, 397, 401—403, 
45 
geometry of multiple regres- 
sion, 396-397 
interpretation, 401—403 
with more than three variables, 
397-401 
regression equations, 395-397 
sampling error, 401 
shrinkage, 401-402 
with three variables, 392-395 
Multiple regression, 392—401 


Nair, K. R., 140, 482
Neuhaus, J. O., 421, 482
Newman, D., 268, 272-275, 482 
Newman-Keuls method of multi- 
ple comparisons, 272-275 
Nondirectional test, 150-151 
Nonlinear regression, 117-118, 
280—286 
Nonlinearity test, 280-286 
Nonparametric tests, 15, 157, 
321-343 
Mann-Whitney U, 326-329 
monotonic trend for correlated 
samples, 337-340 


Nonparametric tests: 
monotonic trend for inde- 
pendent samples, 335-337 
rank, 326-340 
for k correlated samples,
333-335
for k independent samples,
331-333 
for two correlated samples, 
329-331, 460-462 
tables for, 460-462 
for two independent samples, 
326-329 
tables for, 464-467 
sign: for k independent 
samples, 331-333 
for two correlated samples, 
324-325 
for two independent samples, 
323-324 
significance, 321—343 
Normal distribution curve, 86-95 
as approximation to binomial, 
92-93 
area under, 89-91, 448—449 
formula for, 88—89 
ordinates, 89, 448-449 
standard score form, 88-89 
summary of properties, 93-94 
table of ordinates and areas, 
448-449 
transformation to, 383—386 
Normal transformation, 383-386 
Norms, 375-389 
Novick, M. R., 347, 373, 481 
Null hypothesis: 
in analysis of variance, 214, 
230 
meaning, 147-148 


Olds, E. G., 458n. 
One- and two-tailed tests, 
150-151 
Ordinates of normal curve, 88—89 
table of, 448-449 
Orthogonal comparisons, 281 
Orthogonal polynomials, 278, 
280-281 
table of coefficients, 463 


Paired-comparisons method, 
315-318 

Parallel forms method, 366 

Parameter, 11, 120 

Partial correlation, 390-392 


Pascal's triangle, 81 
Pearson, Karl, 8, 97 
Percentiles, 377-383
Permutations, 77-78
Phi coefficient, 348-351
effect of marginal totals on,
350-351
related to chi square, 349
standard error, 351
Pivotal condensation, 398-401
Platykurtic distribution, 37, 41
Point biserial correlation,
356-358
Point estimate, 136
Polygon, 34-36
Polynomial, 280-281
Population:
defined, 6, 120
finite, 7, 126-128
infinite, 7, 128-131
numerical properties, 7
Prediction, 96, 107-118
errors, 114-115
meaning, 107-108
in relation to correlation, 96,
113-114, 392-403
Principal-factor solution, 415-419
Probability, 70-85
addition theorem, 75-76
and binomial, 78-81
conditional, 74-75
distributions, 76
exact, 340-342
joint, 74-75
multiplication theorem, 75-76
nature, 71-73
Probits, method of, 5
Product-moment correlation,
96-119
assumptions underlying,
117-119
computation, 101-103 
critical values, 457 
definition, 99-100 
related to regression, 113-114 
related to Spearman's rho, 306 
sampling error, 167-169 
variance interpretation, 
115-117 
Proportion: 
confidence interval, 143-144 
significance of difference: cor- 
related samples, 162—164 
independent samples, 
160-162, 186-187 
standard error, 143-144 


490 
E psschological test statistics, 


E 347-361 
_ Psychophysics, 4-5 


> 

%, Random, meaning, 121 
. Random numbers, 201 

"Randomization, 201-202 

_ Randomized block experiment, 
- . 204-205 

Randomly parallel tests, 372 


Kendall's tau, 308—312 
significance, 309-311 
with tied rapks, 309 
Spearman's rho, 305-308 
significance, 307-308 
with tied ranks, 306-307 
ank order statistics, 303—320 
Rank tests of significance, 
4 326-340 
"Reciprocal transformation, 220 
. Rectangular distribution, 39 
gression: 
in analysis of covariance, 291 
298-299 
bivariate distribution, 111-113 
_ equation, 108-114, 395-396, 
; 400-401 
à homogeneity of, 298-299 
linear, 107-115, 277-280, 
395-397 
meaning, 107-114 
multiple, 392-403 
nonlinear, 117-118, 280-285 
related to correlation, 113-114
transformations, 386-388 


asymptotic, 322 
Reliability (see Error, of measurement; Reliability coefficient)
Reliability coefficient:
and attenuation, 370-371 
defined, 365 
for difference scores, 371 
effect of test length on, 369 
in experimental psychology, 373-374
methods of determining, 365-368
Reliability index, 360
Repeated measurements, 
241-243, 264-265 
Response patterns, 351-352 


Richardson, M. W., 368, 481 

Rotational methods in factor 
analysis, 421—425 

Ryan, T. A., 268 


S:
in definition of tau, 308
as measure of disarray, 304 
significance of, 309-311 
standard error of, 309-311 
table of, 459 
in trend analysis, 335—340 
Sample, meaning of, 9-10, 120 
Sampling, 9-10, 120-135 
proportional stratified sample, 122
random, 121—123 
stratified random sampling, 122 
systematic, 121-123 
Sampling distribution: 
of chi square, 175-177 
of correlation coefficient, 167-169
biserial, 359 
of difference, 133-134 
experimental, 125 
of F, 165-166 
of mean: from finite population, 126-128
from indefinitely large population, 128-131
meaning, 124-126 
of proportion, 131—133 
of S in the definition of tau, 
309-311, 459 
of t, 140-142 
theoretical, 125 
Sampling error: 
meaning, 123—124 
of product-moment correlation, 
167-169 
Sampling statistics, 10 
Sampling theory, 120-134 
Saunders, D. R., 421, 482 
Scatter diagram, 98-99 
Scheffé, H., 268, 270-271, 274-275, 482
Scheffé method of multiple comparisons, 270-271
Set, 73-74 
Shrinkage in multiple correlation, 
401—402 
Siegel, Sidney, 315, 323, 342, 482 


Sign tests, 323-326 


Significance: 
levels, 149-150 
meaning, 146-148 
nonparametric tests, 321—343 
rank tests, 326-340 
Small sample statistics, 139, 141-142
Snedecor, G. W., 221, 453n., 482
Spearman, C., 404-405
Spearman-Brown formula, 367, 
369 
Spearman's rank coefficient, 
305—308 
critical values, table, 458 
Specificity, 410 
Split-half method, 366 
Square root transformation, 220 
Squares and square roots, table, 
469—478 
Standard deviation, 61 
adding a constant, 62—63 
advantages, 66 
calculation, 62 
for combined groups, 64 
confidence interval, 144—145 
of first N integers, 63-64 
of measurement error, 371-373 
multiplying by a constant, 
62-63 
standard error, 144—145 
Standard error, 125 
of biserial correlation 
coefficient, 359 
of correlation coefficient, 
168—169 
of difference, 133-134 
for correlated proportion, 162-164
for independent proportions, 160-162
for means of independent samples, 133
for z_r (transformed r), 170-171
of estimate, 114-115 
of mean: from finite population, 126-128
from indefinitely large population, 128-131
meaning, 125 
of measurement, 371-373 
of median, 144 
of percentage, 143-144 
of phi coefficient, 351 
of proportion, 131-133, 143 




of S in definition of tau, 309-311
of standard deviation, 144-145
of z_r (transformed r), 169
Standard score:
and correlation, 99-100
defined, 64-66
sum of squares of, 66
transformation, 377
Standardization of tests, 375-389
Stanine scale, 386
Statistics:
as study of population, 6-8
as study of variation, 8-9
Stevens, S. S., 14, 482
Stratified sampling, 122
"Student" (W. S. Gosset), 141
Studentized range, 271
critical values of, 468
Sufficient estimate, 136, 138
Summation notation explained, 19-22

Swineford, Frances, 416, 480 


t distribution, 140-142
t ratio, 140-142 
in comparison of means 
following F test, 269-270 
and confidence limits, 143 
for correlation, 169-172 
critical values, table, 450 
for difference: of correlated 
variances, 167 
of means: for correlated 
samples, 153-155 
for independent samples, 
151-153 
for unequal variances, 
155-157 
for partial correlation, 392 
for point biserial correlation, 
358 
related to F, 218-219 
for Spearman’s rho, 308 
T-score transformation, 383-386 
T scores, 383—386 
Tabular representation, rules, 32 
Test construction statistics, 
346—361 
Test items: 
homogeneity of, 354 
internal consistency of, 
353-354 
mean of, 347-348 


response patterns, 351—352 
selection of, 354-361 
variance of, 347—348 
Test-retest method, 365-366 
Test standardization, 375-389 
Thomson, G. H., 398, 399n., 405, 482
Thurstone, L. L., 33n., 405, 411, 
416, 482 
Tied ranks: 
in coefficient of concordance, 
314 
in Kendall's tau, 309 
in Spearman's rho, 306-307 
Ties in rank test: 
for k independent samples, 
332-333 
for two independent samples, 
328 
Torgerson, Warren S., 13, 14, 482 
Transformation, 375-389 
with age allowance, 388—389 
arc sine, 220 
Fisher's z_r, 168-171, 456
logarithmic, 220 
nature, 375-376 
to normal distribution, 383-386 
to percentile ranks, 377-383 
of r to z_r, 168-169
reciprocal, 220 
regression, 386-388 
square root, 220 
to standard scores, 377 
to stanines, 386 
to T scores, 383-386 
Trend analysis, 276-287 
correlated data, 286 
extended applications, 286 
linear trend, 277-280 
meaning of, 276 
nonlinear trend, 280-285 
nonparametric, 335-340 
orthogonal polynomials, 278, 
281-285 
partitioning sum of squares: 
for linear trend, 277-278 
using orthogonal polynomials, 282-283
polynomial regression, 280-281 
unequal n’s, 279, 285 
True scores: 
and correlation, 370-371 
defined, 362—363 
variance, 364 
Tsao, Fei, 238, 482 


Tukey, J. W., 268, 274, 482
Two-tailed test, 150-151
Type I error, 148-149, 274-275
Type II error, 148-149, 274-275


U-shaped distribution, 39
U test, Mann-Whitney, 326-329
Unbiased estimate, 136-137
of variance, 59
Uniqueness, 410
Unit of measurement, 18-19
Urban, F. M., 5 


Value, expected, 137, 213-214, 
229-232, 254-257 
Variable: 
continuous, 12 
defined, 11 
dependent, 12, 86-87 
discrete, 12 
independent, 12, 86-87 
interval, 13-14, 199-200 
nominal, 13, 199-200 
ordinal, 13, 194-200 
qualitative, 14 
quantitative, 14 
ratio, 13-14, 199-200 
types, 11-14 
Variance: 
additive nature, 103-105 
advantages, 66 
analysis (see Analysis of 
variance) 
biased estimate of, 59 
calculation, 62 
of combined groups, 64 
defined, 58-61 
of difference, 103—105 


364 
estimate (see Analysis of variance, variance estimate)
homogeneity, 153, 164—165 


significance of difference: 
correlated samples, 167
independent samples, 
164-166 
of sums, 103-105 
of test items, 347-348 
of true scores, 364 
unbiased estimate, 59 
Variate, defined, 12 
Varimax, method of rotation, 
423-425 


Walker, Helen M., 482
Welch, B. L., 155-157, 482
., 48n.
Whitney, D. R., 326, 329, 481
Wilcox, Roberta A., 460n.
Wilcoxon, F., 326, 329, 331, 460n.
Wilcoxon signed-rank test, 329-331
critical values, tables, 460-462
Wilk, M. B., 230, 483
Winer, B. J., 206, 219, 240, 268, 271, 300, 483
Woo, T. L., 182, 483
Wrigley, Charles, 421, 482



Yates, F., 17, 121, 188, 201, 205, 
281, 450-451n., 457n. 

Yates correction for continuity, 188


z score (see Standard score)

z_r transformation, 168-171, 456
table transforming r to z_r, 456

Zero, absolute, 14 




OTHER McGRAW-HILL 
INTERNATIONAL STUDENT EDITIONS 
IN RELATED FIELDS 


Blalock: SOCIAL STATISTICS, 2/e 
Chase: ELEMENTARY STATISTICAL PROCEDURES 
D'Amato: EXPERIMENTAL PSYCHOLOGY: Methodology,
Psychophysics, and Learning
Deese: THE PSYCHOLOGY OF LEARNING, 3/e
Games: ELEMENTARY STATISTICS: Data Analysis for
the Behavioral Sciences
Guilford: FUNDAMENTAL STATISTICS IN PSYCHOLOGY
AND EDUCATION, 5/e
Goode: METHODS IN SOCIAL RESEARCH 
Nunnally: INTRODUCTION TO PSYCHOLOGICAL MEASUREMENT 
Siegel: NON-PARAMETRIC STATISTICS FOR THE BEHAVIORAL 
SCIENCES 




